Preface


The objective of this book is to synthesize and explore current theory and prac- 
tice of molecular evolution and systematics focusing on birds. Chapters are written 
by active practitioners discussing current controversies, demonstrating methods, re- 
viewing new findings, and assessing directions for future research. 

As indicated by the title, Avian Molecular Evolution and Systematics, studies of or- 
ganismal phylogeny and of evolution at the molecular level have become closely 
linked. This stems from widespread use of molecular characters that are believed to 
be homologs (similar due to common descent) in phylogenetic inference and from 
reciprocal use of phylogenetic trees in studying the evolution and hypotheses of 
homology for the characters themselves. The mutually informing nature of organ- 
ismal and molecular evolution (Fig. 1) is general, applying to all of life and across 
taxonomic levels. This generality bodes well for the field, signifying a shared re- 
search agenda for many evolutionary biologists, in using analyses of molecular evo- 
lution to inform phylogenetic analyses and vice versa. Indeed, a great deal has been 
learned about the evolution of both organisms and molecular sequences during the 
past decade in this fashion. 

However, not all change at the molecular level is so impervious to convergence,
natural selection, varying functional constraints, or chance events as to provide a 
linear measure of the passage of time or of relatedness of taxa. It is increasingly clear 
that evolution of some molecular characters can be as quirky and unique to indi- 
vidual lineages as that seen for some phenotypic characters. The need to reconcile 
evolutionary inferences across numerous molecular and phenotypic analyses pres- 
ents biologists with both challenges and opportunities. For example, assessing dis- 
cord between molecular and morphological analyses challenges biologists to pro- 
vide well-corroborated phylogenies based on both molecular and nonmolecular 
characters, and provides the opportunity to learn about the different constraints on 
change for different data sets. Assessing discord between gene trees and species trees
provides the same challenge, as well as the opportunity to learn about potential
character divergence within and among populations prior to speciation. A degree
of uncoupling for evolutionary change across data sets can complicate recovery of
the actual phylogenetic pattern; however, this reflects differences in evolutionary
processes operating at different levels of organization and on different character sets,
and an understanding of both process and pattern is key in a comprehensive view
of evolution (Fig. 1).

FIGURE 1 Diagram indicating the mutually informing nature of hypotheses of organismal phylogeny
and processes of molecular evolution.

Birds have long been a source of insight into the workings of nature. This may 
be attributed, at least in part, to birds being both conspicuous and widespread, at- 
tracting numerous field observers. This has in turn yielded detailed understanding 
of many facets of avian behavior, distribution, and ecology. As an indication of the 
historical role of ornithology, observations of birds figure prominently in Aristotle's 
4th-century BCE History of Animals, in which he lays the groundwork for the even- 
tual organization of biology into physiology, morphology, systematics, embryology, 
ethology, and ecology. In the 13th century, Frederick II was an ardent natural his- 
torian of birds and falconer and pioneered a return to direct observations in seeking 
explanations for the natural world, when nearly all others sought explanations in 
the revealed word of the churches. Darwin's observations of finches on the Gala- 
pagos Islands were vital in development of thought on evolution by means of natural 
selection. In recent times, birds have been central to work in many fields relevant 
to evolutionary biology, including studies of species formation and species defini- 
tion; comparative morphology, physiology, and endocrinology; studies of mating 
systems and reproductive strategies; roles of kinship in evolution; population and 
community ecology; effects of environmental change and conservation strategies 
for populations and species; and the roles of vicariance and dispersal in biogeogra- 
phy. As a consequence, researchers undertaking new molecular studies of birds have 
available a wealth of background material on comparative avian biology to provide 
context and inform interpretation of the molecular findings. Thus, there is good
reason to be optimistic about the prospects for further insight into evolutionary 
patterns and processes to be gained from the continued study of birds. 

This volume had its beginnings in the symposium "Avian Molecular Evolution" 
held at the 113th stated meeting of the American Ornithologists' Union, August
15-19, 1995 in Cincinnati, Ohio. Six of the chapter authors (Edwards, McDonald, 
Mindell, Quinn, Sheldon, Zink) gave presentations in the symposium, and subse- 
quently, seven others were invited to contribute chapters along with their collabor- 
ators. As organizer and editor of this book, I have attempted to seek some balance 
in coverage of topics, taxa, and taxonomic levels. However, this volume is not in- 
tended to be encyclopedic in its coverage, being restricted by the current interests 
and expertise of the various authors. It is my hope that this volume will stimulate 
inquiry and promote understanding among researchers and students working in this 
area, as well as provide specialists and nonspecialists alike with useful topic reviews 
and empirical examples. 

The chapters are divided into two sections: (I) Molecular Sequences and Evolu- 
tionary History in Birds and (II) Applying Phylogeny and Population Genetics to 
Broader Issues. Authors focusing on the evolution and utility of molecular markers 
consider current understanding of the evolution of the mitochondrial genome in 
birds and closely related vertebrates (Quinn), the inherent difficulties and applica- 
tions for nuclear DNA microsatellites (McDonald and Potts), the applicability of 
mitochondrial control region sequences to studies of population structure (Baker 
and Marshall), and the range of taxonomic resolution for mitochondrial cyto- 
chrome b (Moore and DeFilippis). Investigators consider methodological issues in 
systematics and variously present new data and analyses of phylogeny for select spe- 
cies or populations of Charadriiformes, Apodiformes, and Passeriformes (Baker and 
Marshall, Edwards, Sheldon and Whittingham, Zink); for species of Piciformes 
(Moore and DeFilippis), Gruiformes (Houde et al.), Pelecaniformes (Siegel-
Causey), and ratites (Lee et al.; Cooper); and for Falconiformes, Strigiformes, An-
seriformes, Galliformes, Turnix, Opisthocomus, and Phoenicopterus (Mindell et al.).
Applications of phylogeny and population genetics studies to broader issues include
an assessment of the relevance of population-level processes to phylogeny at the 
species-level and above (Edwards); uses of phylogenetic hypotheses in studying evo- 
lution of behavior, morphology, and ecology (Sheldon and Whittingham); patterns 
of geographic variation and their potential causes (Zink); speciation (Roy et al.); and 
paleoecology and conservation biology using DNA from extinct taxa (Cooper). 

All chapters have gone through a process of peer review, and I am extremely 
grateful to the following persons for insightful reviews of one or more chapters: 
Marc W. Allard, Jeremy Austin, John M. Bates, Anthony H. Bledsoe, Scott V. Ed- 
wards, Frank B. Gill, John Harshman, Peter Houde, Arnold G. Kluge, Thomas D. 
Kocher, Carey W. Krajewski, Mary C. McKitrick, Axel Meyer, William S. Moore, 
Robert B. Payne, Richard O. Prum, Thomas W. Quinn, Frederick H. Sheldon, 
Michael D. Sorenson, Robert K. Wayne, and David F. Westneat. I thank Cyndy
Sims Parr for valuable assistance in preparation of the index, and I am grateful to 
Jonathan Higgins for preparation of the cover illustration. Finally, I thank Margaret, 
June, and Eugene Mindell for their constant support and encouragement, which
were instrumental in completion of this project.

T. W. Quinn
I. INTRODUCTION 


Advances in molecular techniques, particularly the development of the polymerase 
chain reaction (PCR; Mullis et al., 1986), have made studies of vertebrate genomes 
increasingly practical, and have obviated the requirement for fully equipped mo- 
lecular biology laboratories. Among ornithologists, this has resulted in an explosive 
increase in studies of systematics and population genetics. To date, many of these 
studies have focused on the mitochondrial genome, mainly at the level of DNA 
sequence determination. While several chapters in this book illustrate the enormous 
value of primary mitochondrial DNA (mtDNA) sequence in learning about the 
evolutionary history of a group of organisms, this chapter is meant to provide an 
overview of the evolution of the avian mitochondrial genome from a broad per- 
spective, mainly at a level of organization above the primary sequence level. An 
attempt is made to illustrate similarities and differences between the mitochondrial 
genomes of Aves and other vertebrate classes, and to point out how some of these 
differences have provided unique opportunities to probe deep phylogenetic ques- 
tions. It is also emphasized that the avian nuclear genome contains homologs to 
mitochondrial sequences in at least some species, and this may have serious impli- 
cations for the interpretation of mtDNA data sets used in population genetic and 
systematic studies. At the same time, such occurrences present valuable new oppor- 
tunities to gain a better understanding of sequence evolution in both mitochondrial 
and nuclear genomes. 


II. MITOCHONDRIA: AN ANCIENT LEGACY


In 1922, Wallin proposed that mitochondria arose by endosymbiosis, an idea that 
was championed and expanded on by Margulis (1970). They proposed that mito- 
chondria originated when a protoeukaryotic cell engulfed or was penetrated by an 
aerobic bacterium. The endosymbiont hypothesis was initially treated with much 
skepticism, but it has subsequently received strong support and is now widely ac- 
cepted (Gray, 1983), with debate now centering on whether mitochondria are 
monophyletic or polyphyletic in origin (see review in Gray, 1989) and on the iden- 
tity of the original symbionts (Yang et al., 1985; Cedergren et al., 1988; Andachi 
et al., 1989; Cardon et al., 1994). The acquisition of such an aerobic endosymbiont 
is among the most important events in the history of life, for the descendants of 
those associations now comprise almost all eukaryotic life, both single-celled and 
multicellular. 

Among a number of traits that are indicative of this evolutionary heritage is the 
fact that mitochondria carry their own (usually circular) independently replicating 
genomes. The small size of this genome in animals relative to that of extant bacteria 
(approximately 1% as large) is presumably the result of elimination of genes that 
were redundant with those in the nuclear (host) genome, and of the occasional 
transfer of functional genes from the mitochondrial genome to the nuclear genome
in our endosymbiotic ancestors [see, for example, Schuster and Brennicke (1987) 
and Nugent and Palmer (1991)]. The discovery of a circular genome within the 
mitochondria of most eukaryotes not only bolstered the endosymbiont hypothesis, 
but it identified a small and easily isolated source of genetic information outside of 
the large and complex nuclear compartment. 


III. THE VERTEBRATE MITOCHONDRIAL GENOME 


Since the development of recombinant DNA techniques, evolutionary biologists 
have made the vertebrate mitochondrial genome one of the most extensively stud- 
ied on the planet. It is undeniable that a major factor in the initial selection of 
mtDNA for evolutionary studies was the relative ease with which it could be puri- 
fied and manipulated in the laboratory, owing to its high copy number and super- 
coiled conformation, which allows separation from linear (nuclear) DNA. Protocols 
for such isolations are included in Lansman et al. (1981) and in Dowling et al. (1990). 
Robin and Wong (1988) estimated that there are 800 mitochondrial genomes per 
cell, and an average of 2.6 genomes per mitochondrial organelle, within cultured 
human lung fibroblast cells, while Michaels et al. (1982) estimated 2600 genome 
copies per cell in primary bovine tissue culture cells. Estimates for other tissue types 
vary widely. 

Once purified, mtDNA sequence differences between species or individuals can 
be inferred indirectly using restriction endonucleases to generate discrete fragments 
that can then be compared via electrophoresis through agarose gels, or directly by 
DNA sequencing. The polymerase chain reaction (PCR) has increasingly replaced 
such purification methods by allowing the direct amplification and sequencing of 
mtDNA from unpurified sources. However, as nuclear copies of mitochondrial 
genes have been noted in birds, use of direct purification methods will be of con- 
tinuing value (see Section VI). 
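
The restriction-fragment comparison described above can be sketched in a few lines of Python. This is only an illustrative toy, not a protocol from the cited studies: the function name `circular_digest`, the two short "genomes," and the choice of the EcoRI recognition sequence GAATTC are assumptions made for the example. Cut positions are placed at the start of each site, which leaves fragment lengths unchanged.

```python
def circular_digest(sequence: str, site: str) -> list[int]:
    """Return fragment lengths (largest first) from cutting a circular
    molecule at every occurrence of a recognition site."""
    n = len(sequence)
    # Append the first few bases so that a site spanning the arbitrary
    # linear start/end of the circle is still detected.
    wrapped = sequence + sequence[: len(site) - 1]
    cuts = [i for i in range(n) if wrapped.startswith(site, i)]
    if not cuts:
        return [n]  # an uncut circle migrates as a single molecule
    # Each fragment runs from one cut to the next cut around the circle;
    # a single cut simply linearizes the circle into one full-length piece.
    fragments = [
        (cuts[(k + 1) % len(cuts)] - cuts[k]) % n or n
        for k in range(len(cuts))
    ]
    return sorted(fragments, reverse=True)


# Hypothetical 30-bp "genomes" from two individuals; the second has lost
# one site through a single base substitution (GAATTC -> GAGTTC).
individual_a = "GAATTC" + "A" * 10 + "GAATTC" + "C" * 8
individual_b = "GAATTC" + "A" * 10 + "GAGTTC" + "C" * 8

if __name__ == "__main__":
    print(circular_digest(individual_a, "GAATTC"))
    print(circular_digest(individual_b, "GAATTC"))
```

A single substitution that destroys a site merges two fragments into one, which is how sequence differences are inferred indirectly from banding patterns on a gel.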

As some of the first evolutionary studies using purified mtDNA were completed, 
it became apparent that, besides its ease of isolation and small genome size, there 
were many other advantages to studying the mitochondrial genome. First, it is ma- 
ternally inherited (Lansman et al., 1983). Second, there is no direct evidence that it 
can recombine with other mtDNA molecules (Clayton, 1982, 1992; Hayashi et al., 
1985). This means that vertebrate mtDNA is passed on through female lineages in 
a clonal fashion with no horizontal “mixing,” and this makes it more straightfor- 
ward to reconstruct an evolutionary history of this molecule than for the nuclear 
genome. While Gyllensten et al. (1991) have detected low levels of paternal “leak- 
age" of mtDNA between two species of mouse, whether such a finding can be 
extrapolated to intraspecific leakage is still not clear. Kondo et al. (1990) showed 
that biparental inheritance occurred in interspecific Drosophila crosses, but did not 
occur when intraspecific crosses were done. Third, at the sequence level, mtDNA 
has been shown to evolve rapidly relative to DNA in the nuclear genome (Brown
et al., 1979). This is an asset for population studies, but eventually becomes a liability 
as the depth of phylogenetic comparison increases. 
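
Why rapid change becomes a liability at depth can be illustrated with a simple substitution model. The sketch below assumes the Jukes-Cantor model (equal rates among the four bases), which is chosen here purely for illustration and is not a model used in this chapter; the function names are likewise hypothetical. Under that assumption, the observed proportion of differing sites approaches a ceiling of 0.75 as the true number of substitutions per site grows.

```python
import math


def expected_p_distance(d: float) -> float:
    """Expected proportion of differing sites between two sequences
    separated by d substitutions per site (Jukes-Cantor assumption)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_corrected_distance(p: float) -> float:
    """Invert the relationship: estimate substitutions per site from an
    observed proportion of differences p (valid only for 0 <= p < 0.75)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) under this model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Observed differences track true change closely at first,
    # then flatten as multiple substitutions overwrite one another.
    for d in (0.01, 0.1, 0.5, 1.0, 2.0):
        print(f"true d = {d:4.2f}   expected observed p = {expected_p_distance(d):.3f}")
```

For closely related sequences the observed and true values nearly coincide, which suits population studies. For distant comparisons, large differences in true divergence produce almost identical observed values.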

The first mitochondrial genomes to be sequenced in their entirety include hu- 
man (Anderson et al., 1981), mouse (Bibb et al., 1981), cow (Anderson et al., 1982), 
and frog (Xenopus; Roe et al., 1985). Not only was the gene content found to be 
identical, but the gene order was the same, and this, in conjunction with the same 
finding in bony fish, led to the assumption that all vertebrate mtDNA genomes were
identical in both respects (Johansen et al., 1990). As greater numbers of genomes 
have been sequenced, it is apparent that gene order, especially of tRNA genes, is not 
constant. For example, among tetrapods, reports of tRNA transposition have been
made for marsupials (Pääbo et al., 1991; Janke et al., 1994), frogs (Rana; Yoneyama, 
1987), and crocodiles (Kumazawa and Nishida, 1995; Quinn and Mindell, 1996). 
Lee and Kocher (1995) have shown that in an outgroup to the Osteichthyes, the sea 
lamprey, there have been several changes in gene order near the putative control 
region, some of which have included tRNAs. 

All vertebrate mtDNAs contain 22 tRNAs, 13 protein-coding regions, 2 rRNAs, 
and 1 or 2 (lamprey) large noncoding regions. The genome is arranged in an effi- 
cient manner. Introns are absent, and intergenic spacers are small, typically less than 
10 bp. In some cases genes overlap in different reading frames, and there has even 
been a reduction in the size of some stop codons to one or two bases (T or TA). 
Such codons are posttranscriptionally completed with the addition of a 3’ poly(A) 
sequence (Ojala et al., 1981). These characteristics of vertebrate and many nonver- 
tebrate mitochondrial genomes have led to the proposal that intermolecular selec- 
tion for compactness could be the "driving force" that results in such an economical 
gene arrangement, assuming that smaller molecules replicate more rapidly than 
larger ones, all else being equal (Wallace, 1982; Rand and Harrison, 1989; Kurland, 
1992). Evidence in support of this comes from observations that human mitochon- 
drial genomes carrying deletions can, between generations, increase in frequency 
relative to full-sized genomes both in vitro (Yoneda et al., 1992) and in vivo (Larsson 
et al., 1990; Kobayashi et al., 1992). 
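
The completion of truncated stop codons by polyadenylation can be made concrete with a short Python sketch. The function name and the default tail length are assumptions for illustration only. Bases are written in the DNA alphabet (T rather than U) to match the notation used above.

```python
def complete_stop_codon(encoded_end: str, poly_a_length: int = 10) -> str:
    """Return the terminal codon of a transcript after a poly(A) tail is
    added to the one or two bases ("T" or "TA") encoded in the genome."""
    if encoded_end not in ("T", "TA"):
        raise ValueError("expected an incomplete stop codon: 'T' or 'TA'")
    if poly_a_length < 3 - len(encoded_end):
        raise ValueError("tail too short to complete the codon")
    # The first adenines of the tail supply the missing codon positions.
    transcript_tail = encoded_end + "A" * poly_a_length
    return transcript_tail[:3]


if __name__ == "__main__":
    for end in ("T", "TA"):
        print(end, "->", complete_stop_codon(end))
```

In both cases the added adenines yield TAA, so the genome saves one or two bases per gene without losing a functional termination signal.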


IV. AVIAN MITOCHONDRIAL GENOMES 
A. An Altered Gene Order throughout Aves 


In 1990, Desjardins and Morais published the first complete sequence of an avian 
mitochondrial genome. This resource, in conjunction with increasingly sophisti- 
cated methods of accessing worldwide repositories of DNA sequence data, has made 
it possible to use PCR to study almost any part of the genome in any avian species. 
It also provided our first clear view of the similarities and differences in genome 
structure of birds compared with other tetrapods. The most striking feature of the
chicken mitochondrial genome is that, while it contains the same genes as all other
vertebrates, the order of those genes is unique (Fig. 1.1). Additional studies have
made it apparent that this altered gene order is conserved across a wide taxonomic
diversity of birds [Desjardins et al., 1990 (duck); Quinn and Wilson, 1993 (goose);
Wenink et al., 1993 (dunlin); Wenink et al., 1994 (turnstone); Moum and Johansen,
1992 (murre); Moum et al., 1994 (gull, sandpiper)]. While tRNA rearrangements
have been observed in other vertebrates, the avian and lamprey genomes are thus
far the only ones known to have undergone major rearrangements that include
protein-coding genes.

FIGURE 1.1 Schematic diagram of (a) the gene content and order of the chicken mitochondrial
genome (Desjardins and Morais, 1990) and (b) that of placental mammals and Xenopus laevis. tRNAs are
unlabeled except in the WANCY region. Abbreviations of gene names are those in common use except
for the following: Cyt. b, cytochrome b; 12S, 12S rRNA; 16S, 16S rRNA; Control, control region; O_L,
light-strand origin of replication; O_H, heavy-strand origin of replication. These origins are unmapped
in chicken. There is a rearrangement adjacent to the control region in chicken relative to the placental
mammal and Xenopus (highlighted with overline). A short sequence with strong secondary structure
exists within the WANCY region in placental mammals and Xenopus (labeled O_L) but is absent in
chicken [a single base is present between tRNA^Asn (N) and tRNA^Cys (C) in chicken]. The overlap be-
tween ATPase 6 and ATPase 8 is shown by two lines in close proximity. The proportionate sizes of genes
within the chicken genome are accurate, except all tRNAs were drawn to an (average) length of 73 bp.
The genes shown in (b) have been drawn to the same size as in chicken, although slight differences in
gene lengths do exist.

The mechanism leading to such rearrangements is unknown. Moritz et al. (1987) 
proposed that mitochondrial gene order could be changed without intermolecular 
recombination if tandem duplication of part of a genome was followed by deletions 
that include at least some of the "parental" copies. Given the documentation of large
tandem duplications in reptiles (Moritz and Brown, 1986, 1987; Zevering et al., 
1991), amphibians (Wallis, 1987), and mammals (Poulton et al., 1989), and large 
deletions in a similar variety of taxa including reptiles (Moritz and Brown, 1986, 
1987), mammals (Shoffner et al., 1989; Zeviani et al., 1989; Tanaka et al., 1989), and 
birds (Edwards, 1992; also see Avise and Zink, 1988, for another likely candidate), 
this seems to be a reasonable hypothesis for the avian rearrangement. In the ancestor
to birds, beginning with a gene order as found in mammals and Xenopus, this would
require a single duplication of several genes located downstream of the heavy strand
origin of replication (O_H), followed by at least two independent deletion events
(Desjardins and Morais, 1990; Quinn and Wilson, 1993; Fig. 1.2c). Pääbo et al. 
(1991) and Kumazawa and Nishida (1995) documented gene rearrangements 
among tRNA gene clusters in marsupials and crocodiles, respectively, and in each 
case intergenic spacers among the rearranged genes were considerably larger than 
in vertebrates without such rearrangements. In both cases, the authors point out 
that this observation is consistent with rearrangement via duplication, assuming that 
the large intergenic spacers are the remnants of the "extra" copies that are in the 
process of being (gradually) deleted. In these and in the avian rearrangement, a 
"replicative race" could be the force driving subsequent reduction of enlarged ge- 
nomes back to the original size. 
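
The duplication-and-deletion hypothesis can be traced step by step on a list of gene names. The Python sketch below is an illustration under stated assumptions. The gene orders are abbreviated to the region around the control region. The exact boundaries of the duplicated block (ND6 through tRNA-Pro) are chosen here because they give the minimum of two deletions; they are not taken from a specific figure. The helper names are hypothetical.

```python
# Gene order near the control region as found in mammals and Xenopus.
ANCESTRAL = ["ND5", "ND6", "tRNA-Glu", "cyt b", "tRNA-Thr", "tRNA-Pro",
             "control", "tRNA-Phe"]
# Gene order in the same region of the chicken genome.
AVIAN = ["ND5", "cyt b", "tRNA-Thr", "tRNA-Pro", "ND6", "tRNA-Glu",
         "control", "tRNA-Phe"]


def tandem_duplicate(order: list[str], start: int, end: int) -> list[str]:
    """Insert a second copy of order[start:end] immediately after the first."""
    return order[:end] + order[start:end] + order[end:]


def delete(order: list[str], start: int, end: int) -> list[str]:
    """Remove the genes at positions start..end-1."""
    return order[:start] + order[end:]


# Step 1: one tandem duplication of the block ND6 ... tRNA-Pro.
duplicated = tandem_duplicate(ANCESTRAL, 1, 6)
# Step 2: delete cyt b, tRNA-Thr, tRNA-Pro from the second copy.
after_first_deletion = delete(duplicated, 8, 11)
# Step 3: delete ND6 and tRNA-Glu from the first copy.
result = delete(after_first_deletion, 1, 3)

if __name__ == "__main__":
    print(duplicated)
    print(result)
    print(result == AVIAN)
```

Each gene returns to single copy, yet cytochrome b and its two tRNAs now precede ND6 and tRNA-Glu, with no recombination between molecules required. The two deletions are applied in an arbitrary order here; the model places no constraint on which occurred first.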

Both duplication and deletion could occur as the result of slippage during repli- 
cation (Streisinger et al., 1966; Levinson and Gutman, 1987). A convincing model 
that explains the rapid generation of length variation among relatively short tandem 
repeats within the control region of heteroplasmic sturgeon has been presented by 
Buroker et al. (1990). They proposed that frequent misalignment of extending nas- 
cent strands with parental strands could be enhanced by the formation of secondary 
structures within the tandem repeats of either strand when single stranded. Because 
the D-loop region is triple stranded, competitive binding of two H strands (nascent 
and parental) to a single parental L strand could alternately expose both H strands 
to a single-stranded state, thereby facilitating the formation of such single-stranded 
structures. In birds, similar short repeats in the control region and in association 
with heteroplasmy have been shown by Berg et al. (1995); also see Avise and Zink 
(1988). 

For larger duplications and deletions such as envisaged here for the avian re- 
arrangement, the development of a convincing model seems more problematic. 
Dispersed regions of homology might allow misalignment between parent and nas- 
cent strands of DNA during replication as shown in Fig. 1.2a and b, but here the 
mismatch must occur over longer distances, and the role of secondary structure in 
promoting mismatch is not as obvious. Perhaps competitive hybridization can oc- 
cur between the nascent and parental heavy strands in binding to the light strand. 
This would free the nascent heavy strand for reinvasion in a new location. The fact 
that replication is approximately 200 times slower in mtDNA than in Escherichia coli 
(Clayton, 1982) may also facilitate such events. However, tandem duplications in 
humans (Poulton et al., 1989) and in Cnemidophorus (Moritz and Brown, 1986,
1987) typically include the entire control region and hence span both sides of O_H,
an observation not easily explained by such a model. 

Most of the Cnemidophorus duplications include flanking tRNAs, perhaps impli-
cating their involvement in gene duplication (Moritz and Brown, 1986, 1987). Des-
jardins and Morais (1990) suggest that the cotransposition of tRNA genes associated
with one end of cytochrome b (or alternatively, with ND6, depending on which
one was transposed) could be relevant to the avian rearrangement. Edwards (1992)
and Edwards and Wilson (1990) documented a duplication that includes the frag-
ment tRNA^Pro-ND6-tRNA^Glu in Pomatostomus (babblers). The duplicated region is
situated within the 3' end of the control region. Again, this represents a fragment
flanked by tRNAs, and suggests an alternative mechanism as a possible cause of the
putative avian duplication.

FIGURE 1.2 A model explaining how the avian rearrangement could be produced without intergen-
omic recombination. Cyt b, cytochrome b; E, T, P, and F (respectively), tRNA^Glu, tRNA^Thr, tRNA^Pro,
and tRNA^Phe. (a) DNA replication of a mitochondrial genome with the same gene order as observed in
placental mammals begins from the heavy strand origin (solid circle), presumably via a primer molecule,
and proceeds to the ND5-ND6 boundary (top). Melting and reannealing of the extending strand (pos-
sibly during competitive annealing of displaced parental H-strand) place the 3' end of the nascent strand
in a new position with some sequence similarity. Extension resumes to completion (only a subsection of
genome is shown). The resulting molecule includes a large duplication that is resolved after one more
round of replication (bottom). (Drawing is modified from Fig. 9a-c in Quinn and Wilson, 1993.)
(b) Beginning with the duplicated molecule generated in part (a), a deletion occurs during replication
when melting and reannealing occur such that an area at the 3' end of the extending nascent molecule
mispairs with a region of sequence similarity further downstream, perhaps by competitive invasion of a
small region within the parental molecule. Extension resumes to completion to generate a large deletion
that is resolved after one more round of replication (bottom). (c) Overall, the two deletions of the shaded
areas would be neutral or selectively favored, and would eventually return the gene copy number to one
while producing the avian gene order [the rightmost deletion was modeled in (b)]. Although only two
deletions are shown, this represents the minimum number possible using this model. The deletions were
not likely to have taken place simultaneously, and would probably take many generations to be com-
pleted. (Modified from Fig. 9e in Quinn and Wilson, 1993.)

Once a duplication is generated, the mechanism of its reduction could also be 
modeled using slipped strand mispairing (Fig. 1.2b; also see Efstratiadis et al., 1980). 
The discovery that two large independent deletions have occurred within the mi-
tochondrial duplication of Pomatostomus temporalis after it diverged from other Po-
matostomus species that carry the full complement of pseudogenes described above,
and that flanking polycytosine tracts are interspersed with guanines in the undeleted 
genomes, led Edwards (1992) to propose the involvement of slipped strand mispair- 
ing. In humans with certain mitochondrial myopathies, mtDNA deletions are often 
flanked by short direct repeats (Shoffner et al., 1989 and references therein). Such 
sites may provide areas across which mispairing of complementary regions could 
occur during DNA replication. To explain some unexpected observations in a series 
of PCR and sequencing experiments, Quinn and Mindell (1996) proposed that 
some mitochondria within a single tuatara (Sphenodon punctatus) sample carried a 
deletion that included cytochrome b, tRNA^Pro, and a small piece of the control
region. This putative deletion is of particular interest since it encompasses one of 
the two areas shown in Fig. 1.2c as leading to the avian mitochondrial rearrange-
ment, except that in tuatara tRNA^Thr is missing from its "usual" location next to
cytochrome b. 


B. Other Features of the Avian 
Mitochondrial Genome 


Desjardins and Morais (1990) have compared the avian mitochondrial genome to 
those of mammals and amphibians. Many similarities exist besides the common 
gene content, including (1) in chicken, several genes end with incomplete stop 
codons that are presumably completed by polyadenylation as proposed for mam- 
mals; (2) codon/anticodon rules are the same for those codons present in chicken, 
although the frequencies of use may differ (see Desjardins and Morais, 1990, for 
details); (3) guanine is relatively infrequent at third positions of codons; (4) several 
genes, including ATPase 6 and 8, overlap, by the same amount as in Xenopus, but 
less than in mammals, while others are butt-joined; (5) the control region includes 
a transcriptional promoter, which, as in amphibians, is bidirectional (L’Abbé et al., 
1991), and the O_H (Glaus et al., 1980).

Several differences can be listed as well. In addition to an altered mitochondrial 
gene order, birds lack the hairpin structure that forms the light-strand origin of 
replication (O_L) in mammals and Xenopus (where it is located within a tRNA cluster
called the WANCY region between tRNA^Asn and tRNA^Cys). Thus far, which part
of the avian genome serves this function remains unknown, although it is intriguing 
to note that Tapper and Clayton (1981) mapped two distinct 5' end positions of the 
human daughter L strand, one of which is located within the complementary se-
quence to the tRNA^Cys gene anticodon.

In chicken (Desjardins and Morais, 1990) and in goose (T. W. Quinn, unpub- 
lished), COI has a putative GTG initiation codon that is unusual among vertebrates 
and is unique for this gene. Another feature peculiar to the avian genome is the low 
incidence of thymine at silent positions within coding regions of cytochrome b and 
presumably other protein-coding genes coded on the same strand. In reporting this 
anomaly, Kocher et al. (1989) point out that the extreme compositional bias created
by this and the guanine deficit mentioned above may make "saturation effects" 
particularly severe in DNA sequence-based phylogenetic studies of birds. 
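
Compositional bias of the kind described above is straightforward to measure. The sketch below tallies base frequencies at third codon positions, used here as a rough proxy for silent sites, which is a simplifying assumption. The function name and the 12-base toy sequence are hypothetical and do not represent real cytochrome b data.

```python
from collections import Counter


def third_position_composition(coding_sequence: str) -> dict[str, float]:
    """Return the frequency of A, C, G, and T at third codon positions of
    an in-frame coding sequence (trailing partial codons are ignored)."""
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3
    thirds = seq[2:usable:3]  # every third base, starting at codon position 3
    counts = Counter(thirds)
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("no complete codons with unambiguous third positions")
    return {base: counts[base] / total for base in "ACGT"}


if __name__ == "__main__":
    toy = "CTACCCGCAACC"  # codons: CTA CCC GCA ACC
    print(third_position_composition(toy))
```

When one or two bases are nearly absent at these positions, as reported for guanine and thymine in birds, most silent changes shuttle between the remaining bases, so repeated substitutions at a site are more likely to restore an earlier state.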


V. MAJOR GENOMIC FEATURES AS CHARACTERS 
FOR PHYLOGENETIC ANALYSIS 


A. Two Approaches to Phylogenetic Analysis
Using DNA Sequence Information 


Studies of systematic relationships frequently make use of direct comparison of 
DNA sequence information, from which estimates of branching order can be made, 
using an assortment of methods that include parsimony, maximum likelihood, and 
distance. Such approaches have provided a wealth of new information, and with the 
concurrent refinement of statistical or "quasistatistical" methods, allow the relative 
robustness of conflicting phylogenetic hypotheses to be evaluated. Despite the in- 
creasing sophistication of analytical methods, controversies over alternative phylo- 
genetic hypotheses are not uncommon, especially in cases where several species 
have diverged in close temporal proximity, or in cases where divergences occurred 
far in the past. In the first case, resolution may not even be possible with mtDNA 
as the only source of information because species trees and gene trees may differ 
(see Avise et al., 1987; Avise, 1994; and Chapter 9, by Edwards, in this volume). In 
the second case, resolution may be theoretically possible, but it becomes difficult in 
practice because of problems distinguishing "signal" from "noise." It is this second 
case that is addressed in the following discussion. 

In analyses that attempt to distinguish between alternative branching orders of 
deeply divergent taxa, two problems that frequently arise are sequence alignment 
and sequence saturation. The problems with alignment can be minimized by careful 
choice of the genes/regions being compared. For instance, regions that are evolu- 
tionarily constrained, presumably by function, are easier to align than those that are 
not. Kumazawa and Nishida (1993) present a convincing case for their ability to 
unambiguously align stem regions in mitochondrial tRNA genes of vertebrates. The 
use of conserved protein-coding regions may also circumvent some alignment 
problems. Unfortunately, a balance must be reached between constraint and vari- 
ability. A region that is too constrained will provide few variable or informative sites 
for phylogenetic analysis. In contrast, in a region that is not constrained enough, 
alignment of homologous bases may become difficult because of numerous inser- 
tions or deletions. In many such cases, alignments can be done reasonably well, 
particularly if done conservatively such that ambiguous areas are eliminated from 
the final analysis, preferably by a stated and objective set of rules. 
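A "stated and objective set of rules" for excluding ambiguous regions can be as simple as a gap threshold applied column by column. The following is a minimal sketch of one such rule, not a method taken from the studies cited; the function name, the threshold parameter, and the toy sequences are all hypothetical.

```python
def trim_alignment(aligned, max_gap_fraction=0.0, gap_chars="-?N"):
    """Keep only alignment columns whose share of gap/ambiguity symbols
    is at or below max_gap_fraction.

    aligned: dict mapping taxon name -> aligned sequence (equal lengths).
    Returns (trimmed alignment, list of retained column indices).
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_taxa = len(aligned)
    n_cols = lengths.pop()

    keep = []
    for col in range(n_cols):
        # Count taxa showing a gap or ambiguity symbol in this column.
        n_gaps = sum(seq[col].upper() in gap_chars for seq in aligned.values())
        if n_gaps / n_taxa <= max_gap_fraction:
            keep.append(col)

    trimmed = {name: "".join(seq[c] for c in keep)
               for name, seq in aligned.items()}
    return trimmed, keep


# Hypothetical toy alignment; columns 2 and 3 contain gaps and are dropped.
toy = {"taxonA": "ACG-TTA", "taxonB": "ACGATTA", "taxonC": "AC--TTA"}
trimmed, kept = trim_alignment(toy)
print(kept)
print(trimmed)
```

Because the rule is explicit, the list of retained columns can be reported alongside the analysis, so that others can reproduce exactly which sites were excluded.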

Once alignment is accomplished successfully, the (perhaps) more difficult prob- 
lem of sequence saturation (multiple substitution) must be considered. Multiple 
substitutions at any single site that occur after two taxa have diverged cannot be 
discerned in a pairwise comparison. In phylogenetic reconstruction, this produces 
noise via homoplasy that can be difficult to distinguish from the phylogenetic signal 
that is generated, for example, in cases where single substitutions provide informa- 
tive synapomorphies. Statistical methods can be useful in clarifying the intensity of 
signal, but in many cases the finding is that the signal-to-noise ratio is too low to 
allow differentiation between competing phylogenetic hypotheses. Methods for in- 
creasing the relative weighting given to sites at which substitutions are infrequent 
and/or substitution types that are relatively rare (e.g., Williams and Fitch, 1990) 
may help to accentuate signal over noise. It is also clear from numerous studies that, 
for a given amount of divergence, saturation effects vary between genes and be- 
tween regions within a single gene, and that careful choice of genes/regions may 
allow phylogenetic signal to be optimized for deeper phylogenetic comparisons 
(e.g., Mindell and Honeycutt, 1990; Kumazawa and Nishida, 1993). Sources of bias 
may also exist, such as differences in base composition between compared taxa. 
These also serve to increase the complexity of deeper phylogenetic comparisons, 
and can lead to misleading statistical support. It does not follow logically that gath- 
ering more sequence information will necessarily alleviate such problems of reso- 
lution if the basis is sequence saturation, unless a qualitatively different gene/region 
is chosen or a weighting scheme that corrects for such biases can be devised and 
justified. A poor signal-to-noise ratio is not alleviated by a larger sample size alone, 
especially where noise is produced in a biased fashion. Regardless of the region 
chosen or analyses used, it is apparent that having a clear understanding of the nature 
of substitutional rates and biases remains critical. Unfortunately, such an under- 
standing is a tedious and difficult goal to attain. 
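The way multiple substitutions erode signal can be illustrated with the simplest substitution model. The sketch below assumes the Jukes-Cantor model (equal base frequencies and equal rates among all substitution types), which is an assumption introduced here for illustration only; as noted above, real sequences, and avian mtDNA in particular, violate it through compositional and substitutional bias.

```python
import math


def expected_p_distance(d):
    """Expected proportion of differing sites between two sequences under
    the Jukes-Cantor model, given a true distance d (substitutions/site)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_corrected_distance(p):
    """Jukes-Cantor estimate of true distance from an observed proportion
    of differing sites p. Undefined once p reaches 0.75 (full saturation)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Observed difference flattens toward 0.75 while true divergence keeps growing.
for d in (0.05, 0.25, 0.5, 1.0, 2.0, 4.0):
    p = expected_p_distance(d)
    print(f"true d = {d:4.2f}   expected observed p = {p:.3f}")
```

As the observed difference approaches its ceiling, small sampling errors in p translate into very large errors in the corrected distance, which is one way to see why adding more sites of the same saturated kind does not by itself restore resolution.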

The use of major genomic rearrangements or other such “genomic landmarks” 
as characters for phylogenetic analysis provides a complementary approach to direct 
sequence comparison that may be unaffected by the problems of alignment or se- 
quence saturation. Such approaches make the seemingly reasonable but unproven 
assumption that major rearrangements are unlikely to occur in the same manner 
more than once in the time period under consideration. In theory, there are a near 
infinite number of “character states” that can exist (versus four or, if gaps are in- 
cluded, five in DNA sequence data), and hence saturation might not occur at this 
level. However, proof of this point must await further empirical studies, since 
mechanistic or selective constraints may, in fact, limit the number of possible char- 
acter states. For instance, if gene rearrangement requires the involvement of flank- 
ing tRNAs as discussed earlier, the set of possible outcomes of rearrangement is 
drastically reduced. Similarly, rearrangements may be restricted to those regions 
close to origins of replication, and this would have a similar effect. Selection could 
also play a role. For example, insertions within the coding region of a gene may not 
be evolutionarily “viable.” 

Another difficulty in using genomic landmarks for phylogenetic analysis is that 
the number of informative characters that are available for comparison will be lim- 
ited according to the time scale under consideration and the source and size of the 
genome across which distinguishing features are detected. Certainly for compari- 
sons among vertebrates, this number remains low within the mitochondrial genome, 
although it seems likely that more characters will become available from the (large) 
nuclear genome in the near future. For deeper comparisons, more characters are 
available, and mathematical methods to combine information from numerous char- 
acters are being developed (Sankoff et al., 1992). Ultimately, given different advan- 
tages and disadvantages of using nucleotide bases versus genomic landmarks for 
deeper phylogenetic comparisons, one would gain more confidence that a particu- 
lar branching order has been effectively solved when a variety of studies, including 
nonmolecular ones, either demonstrate congruence or can be combined and shown 
to give, collectively, strong support for a particular phylogenetic hypothesis (al- 
though this is an oversimplification in the sense that all methods are not expected 
to perform with uniform accuracy at all depths of comparison). 
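One simple way to turn gene order into a comparable quantity is to count adjacencies present in one genome but absent in another (a "breakpoint" count). The sketch below is offered only as an illustration of treating gene order as data; it is a simpler stand-in and not the method of Sankoff et al. (1992). It ignores strand orientation, and the "typical vertebrate" order shown is supplied here as an assumption for the example.

```python
def adjacencies(order):
    """Set of unordered neighbor pairs in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def breakpoint_count(order_a, order_b):
    """Number of adjacencies in order_a that are not found in order_b.
    Both orders must contain the same genes."""
    if sorted(order_a) != sorted(order_b):
        raise ValueError("gene orders must contain the same set of genes")
    return len(adjacencies(order_a) - adjacencies(order_b))


# Segment around the control region (CR). The chicken order follows
# Desjardins and Morais (1990); the "typical" order is an assumption
# used here for comparison.
typical = ["ND5", "ND6", "tRNA-Glu", "cytb", "tRNA-Thr", "tRNA-Pro", "CR", "tRNA-Phe"]
chicken = ["ND5", "cytb", "tRNA-Thr", "tRNA-Pro", "ND6", "tRNA-Glu", "CR", "tRNA-Phe"]

print(breakpoint_count(typical, chicken))
```

A single transposition of a block of genes disrupts three adjacencies, so closely related genomes that differ by one event yield a small count, while the count alone cannot reveal whether two lineages acquired the same order independently.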


B. The Sister Taxon to Birds: Evidence 
from Comparisons of DNA Sequence and 
“Genomic Landmarks” 


Literature concerning the determination of the sister taxon to birds illustrates some 
of the strengths and weaknesses of these two general approaches to phylogenetic 
reconstruction. While the classic view derived primarily from paleontological evi- 
dence places birds and crocodilians as sister taxa (Gauthier et al., 1988; Donoghue 
et al., 1989), controversial morphological analyses by Gardiner (1982) and Løvtrup 
(1985) placed birds and mammals as sister groups. In 1990, Hedges et al. published 
DNA sequence data from the 18S and 28S ribosomal RNA genes of a variety of 
tetrapods, and reviewed and reanalyzed much of the available amino acid sequence 
data sets pertaining to tetrapod phylogeny. Taken individually, studies of various 
genes supported different phylogenetic hypotheses, including a bird-crocodilian 
relationship (histone H2B) and a bird-mammal relationship (β-hemoglobin, myo- 
globin, and 18S rRNA). An additional four data sets (genes) supported differing 
hypotheses depending on the analysis employed. 
While one interpretation of those molecular data sets supporting the controver- 
sial bird-mammal clade is that they reflect evolutionary relationships, in the case of 
protein-coding genes, it has been argued that selective forces have obscured phylo- 
genetic information (Dickerson and Geis, 1983; Bishop and Friday, 1988). Hedges 
et al. (1990) originally argued that their 18S rRNA nucleotide data set provided at 
least some support to a bird-mammal clade, but that analysis was subsequently chal- 
lenged. Marshall (1992) showed that a different conclusion is reached if weighted 
parsimony is used to correct for substitution bias and frequency, and Eernisse and 
Kluge (1993) questioned various aspects of the data set including alignments, al- 
though they arrived at a conclusion similar to that of Hedges et al. (1990) when just 
the 18S rRNA data set was used. Eernisse and Kluge (1993) went on to consider all 
available evidence pertaining to this phylogenetic question and reached the conclu- 
sion that a sister relationship for birds and crocodilians is best supported. The reader 
is referred to that paper for carefully formulated and clearly expressed suggestions 
on how best to combine the findings of different studies, both molecular and non- 
molecular, and for insightful comments on the analysis of molecular data sets. 
Hedges (1994), using data from a larger number of genes (14), later decided that the 
original support for a mammal-bird clade provided by the 18S rRNA sequence data 
(Hedges et al., 1990) was misleading because of a higher rate of change in the line- 
ages leading to birds and mammals (long branches attract) and/or a common G + C 
substitution bias. These studies and varying interpretations of a single data set high- 
light the controversies that often arise over how to correct for multiple substitu- 
tions, biases, and alignments in studying deeply divergent taxa with primary se- 
quence information. 

The first published use of major features of the mitochondrial genome for phy- 
logenetic reconstruction involving birds, mammals, and crocodilians was made us- 
ing the presence/absence of O_L (the origin of light-strand replication) as a 
character (Seutin et al., 1994). It was already 
known that structures presumed to be homologous to the human O_L existed in a 
variety of mammals (Anderson et al., 1981, 1982; Bibb et al., 1981; Gadaleta et al., 
1989; Árnason et al., 1991; Árnason and Johnsson, 1992), amphibians (Roe et al., 
1985; Yoneyama, 1987), and fish (Johansen et al., 1990; Tzeng et al., 1992). This 
region was also known to be absent in chicken, quail, and duck (Desjardins and 
Morais, 1990, 1991; Ramirez et al., 1993), and hence, in the same sense as discussed 
below for the ND6-tRNA(Glu) region, must have been lost early within the avian 
clade, or perhaps before the common ancestor to Aves split from its sister taxon. 
Seutin et al. (1994) showed that O_L was also present in turtle and snake, but was 
absent in crocodilian and tuatara. They interpreted the shared absence of O_L in 
Crocodylia and Aves to be a synapomorphy linking those two lineages. 

The absence of O_L in tuatara presented an enigma because, taken in isolation, it 
places tuatara in the same clade as bird and crocodile. The tuatara is usually considered to be 
the sister group to Squamata (snakes and lizards), forming with them the Lepidosau- 
ria. The authors considered two hypotheses concerning this unexpected result. One 
was that a common ancestor to the [Lepidosauria (Crocodylia-Aves)] clade was het- 
eroplasmic for the two character states (presence and absence of O_L) and that 
the heteroplasmy was maintained for a significant length of time, and sorted out 
differentially among the Lepidosauria. The second hypothesis considered was that 
tuatara is more closely related to Crocodylia and Aves than it is to snake, although 
they viewed this as unlikely given numerous other synapomorphies defining the 
Lepidosauria. A third, unmentioned hypothesis that could be put forward is that the 
tuatara O_L was deleted in an independent event after its ancestor diverged from 
other Lepidosauria. One of the two tRNAs that flank the usual location of O_L is 
tRNA(Cys). In the tuatara, tRNA(Cys) is unusual, as the DHU arm has been reduced to 
a 7-bp loop, apparently without any secondary structure. One could speculate that 
this represents the remnant of a larger deletion that included O_L and a portion of 
the adjacent tRNA. If true, this would represent a parallel (independent) deletion. 
This interpretation raises the sobering possibility that even major genomic alter- 
ations may suffer from ambiguity in the correct identification of homologous versus 
nonhomologous evolutionary events. Seutin et al. (1994) also sequenced segments 
of ND2 and COI and found the amino acids to provide information that was also 
supportive of a bird-crocodilian sister relationship. 
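The logic of scoring a presence/absence landmark on competing trees can be sketched with Fitch parsimony, which counts the minimum number of state changes a tree requires. This is an illustration only, not the analysis of Seutin et al. (1994): the character states follow the text above (1 = origin of light-strand replication present, 0 = absent), while the two topologies, including the placement of turtle, are simplified assumptions chosen for the example.

```python
def fitch_steps(tree, states):
    """Fitch parsimony on a rooted binary tree given as nested 2-tuples of
    leaf names. Returns (possible state set at this node, minimum steps)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    # No shared state: one additional change is needed at this node.
    return left_set | right_set, left_steps + right_steps + 1


# Character states as reported in the text (1 = present, 0 = absent).
origin_present = {
    "amphibian": 1, "mammal": 1, "turtle": 1, "snake": 1,
    "tuatara": 0, "crocodilian": 0, "bird": 0,
}

# Hypothetical topologies for illustration.
conventional = ("amphibian", ("mammal", ("turtle",
                (("tuatara", "snake"), ("crocodilian", "bird")))))
alternative = ("amphibian", ("mammal", ("turtle",
               ("snake", ("tuatara", ("crocodilian", "bird"))))))

print(fitch_steps(conventional, origin_present)[1])  # 2 changes
print(fitch_steps(alternative, origin_present)[1])   # 1 change
```

On the conventional tree the character requires two changes, which corresponds to either an independent loss in tuatara or a loss followed by a regain, whereas moving tuatara next to the crocodilian-bird pair requires only one. A single character cannot decide between these readings, which is why the other synapomorphies of the Lepidosauria carry so much weight in the argument.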

Another attempt to determine the sister taxon to birds was made by Quinn and 
Mindell (1996). To study the major rearrangement found adjacent to the avian con- 
trol region (Fig. 1.1), they designed conserved primers to amplify and sequence 
gene junctions at either end of the cytochrome b gene in a variety of reptiles (the 
rearrangement was already known to be widespread among birds). While they were 
able to determine gene order by this method, the rearrangement occurred after the 
common ancestor to extant birds had split from the sampled reptilian lineages. Their 
study illustrates a problem with the use of such approaches for phylogenetic recon- 
struction relative to sequencing efforts, namely that the chance that such an event 
will occur along a “defining” internode within a given phylogenetic tree is very 
difficult to assess a priori. Nonetheless, it seems appropriate to investigate such major 
and readily interpretable genomic “landmarks” in an opportunistic fashion, particu- 
larly when they might add information to deeper phylogenetic relationships that 
have proven recalcitrant to other approaches. 

As the divergence between bird and crocodile is believed to precede that be- 
tween bird and dinosaur, this general approach could still provide an interesting 
synapomorphy if DNA remains intact through geological time periods, a conten- 
tious issue. Woodward et al. (1994) have published cytochrome b DNA sequence 
information obtained via PCR amplifications of extracts taken from two 80-85 
million-year-old bone fragments presumed to be from an unidentified species of 
dinosaur. They were unable to show the sequence to be significantly more similar 
to bird or reptile than to mammal, a problem they attributed to the small length of 
sequence obtainable, and to sequence variability within samples, both presumably 
resultant from DNA damage. Subsequent analyses have raised the strong possibility 
that the sequence originates from accidental amplification of contaminating mam- 
malian DNA sequence (Henikoff, 1995; Allard et al., 1995), most likely of human 
origin (Hedges and Schweitzer, 1995; Zischler et al., 1995b), and probably from 
human nuclear sequences that are homologous to mtDNA (Collura and Stewart, 
1995; Zischler et al., 1995b). 


C. Other Potentially Informative Characters 


With the increasing amount of mitochondrial and nuclear sequence data that is 
available each month, it may soon be possible to explore other types of genomic 
landmarks as potential sources of phylogenetic information at various depths of 
comparison. For instance, unusual codons, such as that proposed by Desjardins and 
Morais (1990) to initiate COI in birds, might be informative. Areas where genes 
overlap, such as ATPase 6 and 8, seem likely candidates for unusual evolutionary 
constraint, and there have been occasional alterations to the overlap. In birds, lam- 
prey, and amphibians, these genes overlap by 10 bp, while in most mammals, the 
overlap is 40-46 bp (30 bp in fin whale). tRNAs may provide useful synapomor- 
phies among vertebrates, as studies of reptile genomes have shown several cases of 
tRNA rearrangement or, assuming that “missing” tRNAs are still present elsewhere 
in the genome, of tRNA movement. For example, Quinn and Mindell (1996) 
showed that tRNA(Thr) is not adjacent to the 3' end of cytochrome b in tuatara, and 
that a putative tRNA pseudogene is at the 5' end of the control region in 
crocodile. Similarly, Kumazawa and Nishida (1995) found another crocodilian 
tRNA rearrangement, and detected the movement of a snake tRNA gene from one 
tRNA cluster to another. Each of these observations adds weight to the idea that 
mitochondrial tRNAs are relatively "mobile" through geological time. Some genes 
have different numbers of codons among different taxa, another potential source of 
characters in cases where insertion/deletion events are relatively rare. For example, 
both placental and marsupial mammals share what appears to be a synapomorphy 
defined by an extra codon near the 5' end of cytochrome b. Finally, Moum et al. 
(1994) used hydropathy profile analysis to propose that an intragenic rearrangement 
may have occurred within the ND6 gene of mammals, raising the possibility that 
more characters of this type will become available as our understanding of protein 
structure improves. 

These various types of genomic features (and more) should be considered as 
candidates for phylogenetic analysis, and several may prove useful at different phy- 
logenetic depths. However, some may also prove to be useless, or even misleading 
for such ventures. The main lesson that emerges from considerations of DNA at its 
most fundamental and arguably least complex level (primary sequence) is that a 
thorough understanding of the forces affecting its evolution is increasingly impor- 
tant to have, but increasingly difficult to attain as the divergence of compared taxa 
deepens. While genomic landmarks are likely to enhance resolution of deeper phy- 
logenetic questions, it also seems likely that similar and perhaps even more complex 
lessons will emerge from higher orders of DNA "behavior," so extreme caution is 
advised. 


VI. INTERGENOMIC TRANSFER 
OF MITOCHONDRIAL SEQUENCES 


A. Nuclear Homologs of Mitochondrial 
DNA: Not Unusual 


In 1983, DNA segments within the nuclear genome with obvious homology to 
portions of the mitochondrial genome were reported in yeast (Farrelly and Butow, 
1983), locusts (Gellissen et al., 1983), fungi (Wright and Cummings, 1983), and 
humans (Tsuzuki et al., 1983). Lopez et al. (1994) used “numt” as a specific title for 
such sequences in domestic cat, and that term seems appropriate for general usage. 
Numts have been discovered in a variety of other eukaryotes. 

Thorsness and Fox (1990) performed an elegant set of experiments in which they 
manipulated yeast organelles to estimate the amount of mtDNA that can potentially 
enter the nuclear compartment. Plasmid constructs designed to complement certain 
nuclear genetic defects were introduced into the mitochondria of yeast cell strains 
normally lacking in endogenous mtDNA. They detected approximately 2 × 10^-5 
transfers per cell per generation. This was a convincing demonstration that circular 
DNA molecules not only can move from mitochondrial organelles into the nucleus, 
but can do so with high frequency. While their study did not show, and 
probably rarely involved, covalent linkage of the transferred plasmids with the nu- 
clear genome, this last step must be evolutionarily common given the large number 
and diversity of species in which numts have been detected. Among animals, this 
includes sea urchin (Jacobs et al., 1983), locust (Gellissen et al., 1983), rat (Hadler 
et al., 1983; Zullo et al., 1991), akodontine rodents (Smith et al., 1992), “domestic” 
cats (Lopez et al., 1994), goose (Quinn and White, 1987; Quinn, 1992), diving 
ducks (M. D. Sorenson and R. C. Fleischer, personal communication), tapaculo 
species (Arctander, 1995), humans (Tsuzuki et al., 1983; Fukuda et al., 1985; No- 
miyama et al., 1985; Kamimura et al., 1989), and other primates (Van der Kuyl et al., 
1995; Collura and Stewart, 1995). The only vertebrate in which there has been a 
concerted effort to determine the frequency of mitochondrial homologs within the 
nuclear genome, Homo sapiens, has several hundred copies of mitochondrial se- 
quence dispersed throughout the nuclear genome (Fukuda et al., 1985; Kamimura 
et al., 1989). 

The examples given above include a variety of mitochondrial genes as well as the 
control region. In some cases the transferred fragment exists as a tandem repeat (e.g., 
Lopez et al., 1994) while in others they are dispersed (e.g., Fukuda et al., 1985). 
RNA intermediates were involved in some transfers (Gellissen and Michaelis, 1987; 
Shay and Werbin, 1992), but no such evidence exists for others. Kamimura et al. 
(1989) reported the contiguous location of three gene segments that are normally 
found widely separated within the vertebrate mitochondrial genome. The central 
segment was inverted relative to the flanking segments. Taken together, these ob- 
servations imply that a considerable variety of events may lead to the formation of 
numts. 


B. An Avian Numt 


In 1987, Quinn and White extended the documentation of numts to birds. They 
cloned a 5.5-kb HindIII fragment of the lesser snow goose (Anser caerulescens caeru- 
lescens) mitochondrial genome into a plasmid vector, and then hybridized the radio- 
actively labeled clone to genomic Southern blots that contained HindIII-digested 
total DNA extract prepared from snow goose blood. While a 5.5-kb band of DNA 
hybridized with the probe as expected, a 3.6-kb band also showed strong signal on 
an autoradiographic exposure (Fig. 1.3, lane 8). When a probe of the entire mito- 
chondrial genome was prepared and hybridized to the same blot, all expected bands 
of mitochondrial origin showed hybridization, but an extra 3.6-kb band was con- 
sistently seen. To verify that the 3.6-kb band was not of mitochondrial origin, the 
same clone was used to probe a Southern blot prepared with HindIII-digested pu- 
rified mtDNA (Fig. 1.3, lane 2). As expected, a single 5.5-kb mitochondrial band 
showed strong hybridization, with no sign of the 3.6-kb band that hybridized when 
total DNA extract rather than purified mtDNA was used. To show that the 3.6-kb 
band was of nuclear origin, they then prepared a Southern blot that contained 
HindIII-digested DNA samples that were expected to have varying nuclear-to- 
mitochondrial ratios. Avian blood-extracted DNA is relatively rich in nuclear DNA 
owing to the nucleated red blood cells, and liver is relatively rich in mitochondria. 
A third sample highly enriched for mtDNA was prepared by homogenizing liver 
tissue to break open cells, and then centrifuging intact nuclei into a pellet. The 
supernatant, containing the smaller mitochondrial organelles and small amounts of 
nuclear DNA from the fraction of nuclei that invariably burst during such isolation 
procedures, was then extracted by standard methods. These samples were digested 
with HindIII and electrophoresed in order of increasing relative nuclear DNA con- 
tent (enriched sample, liver sample, blood sample). This blot, when probed with 
the 5.5-kb clone, showed the relative intensity of the 5.5-kb mitochondrial band to 
decrease while the relative intensity of the 3.6-kb nuclear band increased (Fig. 1.3, 
lanes 6-8). 

Subsequent sequencing of portions of the 5.5-kb mtDNA clone (Quinn and 
Wilson, 1993; T. W. Quinn, unpublished) showed it to contain, in the same order 
as the published chicken genome (Desjardins and Morais, 1990), part of ND5, cy- 
tochrome b, tRNA(Thr), tRNA(Pro), ND6, tRNA(Glu), control region, tRNA(Phe), 12S 
rRNA, and 16S rRNA. Presumably tRNA(Val) lies between 12S rRNA and 16S rRNA 
as in chicken, but this gene was not checked. Thus, the nuclear homolog must 
contain some subset of these sequences. Using PCR, others have also shown evi- 
dence for the presence of numts in several tapaculo (Scytalopus) species (cytochrome 
b; Arctander, 1995) and in several diving duck (Aythyini) species (control region; 
M. D. Sorenson and R. C. Fleischer, personal communication). 

FIGURE 1.3 Demonstration of the presence of DNA sequences with homology to mtDNA se- 
quences within the nuclear genome of the lesser snow goose (Anser caerulescens caerulescens). Fragment 
sizes are indicated in kilobases. Lane 1: 0.2 µg of purified mtDNA digested with HindIII, electropho- 
resed through a 1% agarose gel, and visualized with ethidium bromide. Lane 2: a Southern blot prepared 
from the gel shown in lane 1 was hybridized with a radioactively labeled clone of the 5.5-kb mitochon- 
drial HindIII fragment (lane 1). Lanes 3, 4, and 5: HindIII-digested DNA samples electrophoresed 
through a 1% agarose gel and visualized with ethidium bromide. The DNA samples originate from 
mitochondrially enriched liver extract (lane 3), total (unenriched) liver tissue (lane 4), and blood tissue 
(lane 5). Lanes 6, 7, and 8: Southern blot was prepared from the gel shown in lanes 3-5 and hybridized 
with a radioactively labeled clone of the 5.5-kb mitochondrial HindIII fragment shown in lane 1 (as per 
lane 2) to produce this autoradiograph. Note that in addition to the 5.5-kb band of mitochondrial origin, 
a 3.6-kb band of presumed nuclear origin is seen. Its nuclear origin can be inferred from the fact that its 
intensity relative to the mitochondrial band increases with the increasing ratio of nuclear:mitochondrial 
DNA expected (moving from left to right across lanes 6 to 8). (Reproduction of Fig. 9 in Quinn and 
White, 1987.) 


C. Numts as a Challenge for Studies 
of Avian Phylogenetics 


The presence of numt sequences presents a special challenge for PCR-based studies 
of the avian mitochondrial genome. This is in part because birds have nucleated red 
blood cells, a frequently sampled tissue that is rich in nuclear DNA and relatively 
poor in mtDNA. This in turn means that PCR amplification and sequencing of 
mtDNA targets in blood may be misleading in cases where a “viable” target also 
exists in the nuclear genome. An illustration of this point is given by Quinn (1992). 
In an attempt to study the population genetics of snow geese across their range, 
PCR primers were designed to amplify the rapidly evolving 5' end of the control 
region. These and other primers were used to gather sequence information from 
two populations using, as a template, DNA that had been extracted either from 
blood samples from an eastern population or from liver samples from a western 
population. The initial results showed a major sequence difference between these 
two populations, with all of the eastern samples having an identical haplotype, and 
the western samples having a variety of haplotypes. However, faint "shadow" bands 
in sequences of some samples from the eastern locales often matched western se- 
quences at those nucleotide positions originally thought to differentiate the two 
populations. To investigate this further, DNA isolated from liver tissue (rather than 
blood) from some eastern samples was used as a template for PCR. The resultant 
sequences now appeared close or identical to haplotypes previously determined for 
western (liver) samples. Quinn (1992) concluded that predominantly nuclear se- 
quence was being obtained from blood, and mitochondrial sequence from liver 
(there was some variation in the relative intensities of "shadow" bands among 
samples). Oligonucleotide primers that preferentially matched either the mitochon- 
drial sequence or the nuclear sequence at the 3' end (as determined from the blood 
sample sequences) produced PCR products with sequences distinctly different from 
each other (Fig. 1.4). 
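The reasoning behind homolog-specific primers, namely that mismatches at the 3' end of a primer most strongly disfavor extension, can be made concrete by counting mismatches within a short 3' window. The sketch below is illustrative only: the function, the window size, and both sequences are hypothetical and are not the primers or sequences of Quinn (1992).

```python
def three_prime_mismatches(primer, site, window=5):
    """Count mismatches between a primer and a candidate annealing site
    within the last `window` bases at the primer's 3' end.

    Both strings are written 5'->3' in the same sense, so `site` is the
    stretch of template the primer would ideally be identical to.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must be the same length")
    if window < 1:
        raise ValueError("window must be at least 1")
    tail_primer = primer[-window:].upper()
    tail_site = site[-window:].upper()
    return sum(a != b for a, b in zip(tail_primer, tail_site))


# Hypothetical homologous sites that differ only near the 3' end.
mito_site = "ACCTGATCCTTGGACAGTCA"
numt_site = "ACCTGATCCTTGGACGGCTA"

primer_m = mito_site  # designed to match the mitochondrial copy
print(three_prime_mismatches(primer_m, mito_site))
print(three_prime_mismatches(primer_m, numt_site))
```

The same count applied to any general-purpose primer and any known numt sequence gives a quick indication of whether that primer might, by chance, favor the nuclear copy.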

Quinn (1992) was eventually able to obtain mtDNA sequence from samples 
taken across the species range, but this was obviously a convoluted process. Had 
sequence only been obtained from a single blood sample for this species, rather than 
from two tissue sources from a number of different individuals, it is unlikely that its 
unexpected (nuclear) origin would have been detected. Hybridizing Southern blots 
of restriction endonuclease-digested genomic snow goose DNA with an entire 
mtDNA probe prepared from mouse generates extensive banding “ladders” (T. W. 
Quinn and B. N. White, unpublished), raising the possibility that this or another 
numt is arranged in a tandem repeat within the snow goose genome. Such a tandem 
arrangement was found in domestic cat (Lopez et al., 1994) and provides, numeri- 
cally, a more significant target for PCR primers. 

While the development of primers that can preferentially amplify the nuclear or 
the mitochondrial copy from a single sample is a useful tool (Quinn, 1992; Fig. 1.4), 
it also raises the worrying possibility that such preferential amplification of a numt 
could occur by chance rather than by design. If such events occur, even a single 
"clean" sequence would be no guarantee that the intended mitochondrial target has 
been amplified. Such inadvertent preferential amplification occurred during a study 
of akodontine rodents (Smith et al., 1992), but in this case the misamplification was 
detected, in part because of the unusual placement of stop codons and deletions 
within what was expected to be a protein-coding region. Lopez et al. (1994), in a 
study of domestic cats, noticed double bands in several positions along sequencing 
gels, and eventually showed them to be the result of simultaneous PCR amplifica- 
tion of nuclear and mitochondrial homologs of the same mitochondrial segment. 
FIGURE 1.4 Use of specific PCR primers to differentially amplify mitochondrial or nuclear homolo- 
gous sequences. (Adapted from Fig. 3 in Quinn and Wilson, 1993.) (a) Primer placement relative to the 
mitochondrial control region. “Glu” is similar in sequence to a portion of tRNA(Glu) (E), and will anneal 
to either the nuclear or mitochondrial homolog. “M” is identical in sequence to the mitochondrial 
sequence, while “N” is identical to known nuclear sequence. Further details concerning primer design 
are provided by Quinn and Wilson (1993). F, tRNA(Phe). (b) Comparison of the sequences of two homo- 
log-specific primers [shown in the same orientation as in (a), written 3' to 5', left to right]. Dots indicate 
identity to the upper sequence. Note the mismatches concentrated at the 3' end. (c) Sequences obtained 
from the PCR products of a single DNA sample extracted from liver tissue. PCR primer pairs used in 
the PCR were Glu plus M (“M” lane) or Glu plus N ("N" lane). Arrows point to obvious sequence 
differences. Loading order, from left to right, was GATC. (Modified from Figs. 2 and 3 in Quinn, 1992 
with permission of Blackwell Science, Inc.) 


Van der Kuyl et al. (1995) and Collura and Stewart (1995) showed that nuclear 
homologs are present in a variety of primates, and that those found in humans can 
create special difficulties in the detection of contamination in ancient PCR experi- 
ments since they are imperfectly matched with known human mitochondrial se- 
quences. Thus, the implications of numts for other population and particularly phy- 
logenetic studies, where species are often represented by sequence from a single 
sample, may be profound, depending on their ubiquity. The evidence from humans 
is that such transfers are common, not rare, although whether they are frequently 
in high copy number as in domestic cat remains unknown. If numt sequences are 
gathered with the assumption that they represent mitochondrial sequences, para- 
logy will frequently be mistaken for orthology, possibly affecting phylogenetic con- 
clusions. Arctander (1995) provided evidence that unusual phylogenetic relation- 
ships among Scytalopus species, reported by Arctander and Fjeldså (1994), were 
likely the result of such a problem. Estimation of rates of change of mtDNA could 
also be altered by confusing orthology and paralogy, as it seems likely that numt 
sequences would evolve at the slower nuclear rate compared to mitochondrial se- 
quences (see the next section). 

One way to reduce the likelihood of such problems in modern avian samples 
would be to use purified mtDNA in PCR reactions. However, this would severely 
limit the efficiency of data collection, and furthermore, Collura and Stewart (1995) showed 
that in primate samples, contaminating nuclear sequences can still be amplified from 
such purifications. Another, more crude approach would be to use both blood and 
another tissue such as heart, liver, muscle, or feather in conjunction with PCR, and 
compare the sequencing gels from the two. In snow goose (Quinn, 1992) and in 
the canvasback (Aythya valisineria) (M. D. Sorenson and R. C. Fleischer, personal
communication), a comparison between sequences obtained from blood versus
liver-extracted DNA shows a consistent difference, but scenarios in which this
might not be the case can also be envisioned. For example, if the 3' end of a primer
matches the nuclear sequence more precisely than the mitochondrial target, a single 
clean nuclear sequence may result regardless of tissue type. The use of broadly over- 
lapping sequences generated with different sets of PCR primers might reveal such 
events. While a seemingly more elegant solution would be to use PCR to amplify 
the entire mitochondrial genome, the possibility of transposition of the entire ge- 
nome has not been discounted, and there may be some danger that "jumping PCR" 
could occur among fragmented mtDNA genomes (Pääbo et al., 1990). Nonetheless, 
such an approach may be worth exploring given its established feasibility (Cheng 
et al., 1994). 


D. Opportunities Provided by Numts 


The presence of numts can be problematic, but they also provide a unique oppor- 
tunity to better understand the differences in the mutational "spectra" of mitochon- 
drial versus nuclear DNA sequences. While such comparisons can be made by av- 
eraging the rates of change of nonhomologous genes in the two genomes (Brown 
et al., 1979; Vawter and Brown, 1986; also see Helm-Bychowski and Wilson, 1986), 
the most direct approach would be to place an identical sequence into the two 
environments and then to observe the changes that result. Numt sequences, in con- 
junction with their mitochondrial paralogs, constitute such an experiment. Arc- 
tander (1995) compared DNA sequence information of a mitochondrial and a 
presumptive nuclear cytochrome b pseudogene among eight Scytalopus species to
estimate that the mitochondrial genome was evolving at least 13.6 times faster over- 
all, and 39 times faster at third positions of codons than the nuclear pseudogene. 
Numts have also been used as a source of outgroup sequences for intraspecific com- 
parisons of mtDNA sequence (Quinn, 1992; Zischler et al., 1995a), although cau- 
tion must be exercised since the possibility for gene conversion between mitochon- 
drial and nuclear sequences has not been discounted. 

VII. CONCLUSION


The impressive advances in our understanding of avian evolution that have resulted 
from studies based on the avian mitochondrial genome become obvious by inspect- 
ing various chapters in this book, and the published literature in general. While we 
still have much to learn about the genome and its evolution, our considerable 
knowledge is allowing increasingly sophisticated and reasoned use of its sequences 
for evolutionary inference at all levels. Continuing technological advances have now 
made publication of complete mitochondrial genomes a semiannual event, provid- 
ing even more exciting prospects for evolutionary comparison. 

This book is being published at a time when molecular evolutionary biologists 
are increasingly turning and, in some cases, returning to DNA sequence-based stud- 
ies of the nuclear genome, also reflected in some of the following chapters. In birds 
that genome is more than 75,000 times larger than the mitochondrial genome. It is 
also much more complex. However, because of a different type of inheritance pattern,
a different mutational environment, and distribution of its genes onto numerous
unlinked chromosomes, it promises to bring with it many new insights (e.g.,
Kornegay et al., 1993). Numt sequences may facilitate comparison and
"cross-calibration" of the two genomes, as they provide an empirical view of what happens
to two pieces of DNA that share a common ancestral sequence, but that have sub- 
sequently been placed within two different organelles, originating from separate 
kingdoms. Ultimately, information from the two genomes, regardless of the chro- 
mosome/gene/region being studied, will provide information that is synergistic in 
nature. Perhaps it has been stated too many times, but the technological advances 
of the last 15-20 years have clearly made this one of the most exciting and productive
periods in which to study evolutionary biology.

I. INTRODUCTION


Understanding evolutionary processes at one level of organization often requires 
addressing processes at higher or lower levels of organization. For example, ques- 
tions at the family (kinship) level may raise questions about genetic differences 
among populations. Even when a particular question does not require answers from 
other levels, it may raise intriguing possibilities about consequences for processes at 
those other levels. Given limited resources, evolutionary biologists should benefit 
from tools that allow one to address problems at several levels, using a single data 
source and a single technical tool. 

DNA microsatellites are genetic markers that can be useful in addressing ques- 
tions at a variety of scales, ranging from the extremely fine grained to the fairly 
coarse grained. More specifically, this genetic tool can help solve problems ranging 
from individual-specific, such as determining gender (Longmire et al., 1993; Dele- 
hanty, 1995), to questions of relatedness and parentage (Amos et al., 1993; McDon- 
ald and Potts, 1994; Kellogg et al., 1995; Primmer et al., 1995), the genetic structure 
of populations (Bowcock et al., 1994; Taylor et al., 1994; Dallas et al., 1995; Estoup 
et al., 1995; Paetkau and Strobeck, 1995; Gibbs et al., 1996), and up to comparisons 
among species (Roy et al., 1994b; Garza et al., 1995). Further, it has several technical 
and analytical advantages that make it superior to genetic markers whose domains 
are far smaller. It therefore comes close to meriting the rubric “master of all trades.” 
We present an overview of the technique, assess the sorts of problems
to which it is well suited, and provide examples from the literature and our own
research that illustrate some of the scales at which microsatellites can provide useful
answers. Because several reviews of the technique already exist (Bruford and Wayne,
1993; Schlotterer and Pemberton, 1994; Westneat and Webster, 1994), we strive to 
cover the essentials with a minimum of overlap. A review of molecular techniques 
in zoology (Fleischer, 1996) provides an overview of the role of other markers such 
as minisatellites and mitochondrial DNA, as well as microsatellites. 

We caution against application of inappropriate models to the data. We also stress 
that careful attention to model assumptions is required; microsatellite data sets may 
not always meet model assumptions concerning, for example, the balance between 
drift, mutation, and migration. We provide case histories to illustrate some of the 
potential pitfalls. We hope to stimulate interest in the development of more sophis- 
ticated models and empirical tests of assumptions, so that the analytical techniques 
become maximally consistent with actual patterns observed in natural populations
of birds.


II. TECHNICAL OVERVIEW 


The genomes of most eukaryotes contain thousands of loci containing numerous 
tandem repeats of short nucleotide sequence motifs (Tautz et al., 1986), such as 
(CT/GA)n, where n is the number of repeats (useful range from approximately 8
to 30). Tandem repeat loci are hypervariable owing to high mutation rates (10 ^? to 
10 5) that either increase or decrease the number of repeat units (Wright, 1994). 
We discuss the mutation process below. Tandem repeat loci can be categorized 
arbitrarily into two groups on the basis of the size of the repeat unit; loci containing
shorter repeat units (usually two to six base pairs) are called microsatellites [also 
referred to as simple sequence length polymorphisms (SSLPs), simple sequence 
repeats (SSRs), or short tandem repeats (STRs)] and loci with larger repeat units 
are called minisatellites (the loci used in the original form of DNA fingerprinting) 
(Jeffreys et al., 1985). We suggest that the term “microsatellites” is preferable 
to “SSR” or “STR,” in part because electronic literature searches return much 
extraneous material when using other terms (and particularly acronyms). The 
inclusive term for both micro- and minisatellites is VNTR (variable number tandem 
repeats). 

VNTR alleles differ in length (number of repeats), and are therefore easier to 
identify than markers differing only by sequence; they can be readily discriminated 
on the basis of their differential electrophoretic mobility. Amplification with the 
polymerase chain reaction (PCR) avoids the more laborious alternative procedure 
of Southern blotting, while allowing precise identification of all allelic length vari- 
ants on polyacrylamide gels. PCR amplification is feasible for virtually all microsat- 
ellite loci, but few minisatellites, because the size of the minisatellite tandem repeat 
usually exceeds PCR limitations. The advantages afforded by PCR amplification are 
one reason we feel microsatellites will increasingly dominate the field of genetic 
markers for ecological and evolutionary studies. A second reason is that the muta- 
tion rate of some minisatellite loci is so high (107° to 10 7?) (Jeffreys et al., 1988) 
that even locus-specific minisatellites may be inappropriate for analyses at the scale 
of populations or higher. The mutation rate of microsatellite loci is estimated to 
range between 10 ^? and 10 ^? (Edwards et al., 1992; Hearne et al., 1992; Weissen- 
bach et al., 1992). Heterozygosity appears to be greatest for microsatellites with 
approximately 20 repeat units, although these larger repeats constitute only a small 
fraction of the total repertoire of loci with ≥6 repeats (Ellegren et al., 1995).

The ease of generating and analyzing data for microsatellite loci is offset by the 
need to develop loci in new species. If one is lucky, microsatellite loci will already 
have been developed for the species of interest, or a related species. Such is the case 
for humans, and laboratory and domesticated animals and plants, where hundreds 
of microsatellite loci have been developed for many species, primarily for use in 
mapping genetic traits of scientific or economic interest. The primary reason that 
PCR primers developed for a microsatellite locus in one species will not work in a 
related species is that mutations in the flanking regions may prevent adequate an- 
nealing of the primers. The probability of this flanking region mismatch is of course 
related to the time since divergence of the taxa involved. In our experience, most 
primers will work for congeneric species and quite surprisingly a useful proportion 
of primers work between species in different families as documented in songbirds 
(Hanotte et al., 1994; McDonald and Potts, 1994; Primmer et al., 1996), marine
turtles (FitzSimmons et al., 1995), and whales (Schlotterer et al., 1991). It is likely
that a series of primers for well-studied taxa such as birds will be available within 
the decade, such that screening the series will provide sufficient loci for most proj- 
ects. Until then, most investigators will have to develop their own microsatel- 
lite loci. 

Developing microsatellite loci for a new species requires that one construct a 
genomic library, screen the library for clones bearing one or more tandem repeats, 
sequence the clones, and develop PCR primers that amplify the tandem repeat. 
Each of these steps is relatively routine and one should be able to develop numerous 
microsatellite loci within a few months. Nevertheless, numerous potential pitfalls 
exist for the first-time user. The basic techniques have been published (Tautz, 1989; 
Ashley and Dow, 1994; Schlotterer and Pemberton, 1994; see the useful summary 
by Queller et al., 1993). A publication by Strassmann et al. (1996) and a manual 
available on the Internet (see Appendix I) include complete protocols. Here we 
provide a brief overview of the entire procedure, including a review of relatively 
recent techniques for enriching the library for microsatellite loci. 


A. Generating a Size-Selected 
Plasmid Genomic Library 


A genomic library is a collection of DNA segments (clones) from the species of 
interest, inserted in a microbial vector. This library can be screened with a labeled 
probe of known sequence to select clones containing the same or similar sequences. 
This is the way a clone from one species can be used to clone the same gene in 
related species. In our case we want to clone loci bearing microsatellite repeats, so 
the library is probed with labeled DNA containing these repeats. 

Library construction entails digesting both the plasmid vector and the genomic 
DNA with restriction enzymes that leave compatible ends for ligating genomic 
fragments into the vector. An often used combination is SauIIIA for digestion of
genomic DNA and BamHI for digestion of the plasmid. Since sequence information 
immediately flanking the repeat is needed for developing PCR primers, it is useful 
to clone fragments that are small enough [<600 base pairs (bp)] to be sequenced in 
a single sequencing run. At the very least, one wants to ensure that there will be no 
gaps when sequencing from each end with the forward and reverse plasmid se- 
quencing primers. To obtain fragments of the appropriate size, digested genomic 
DNA is run on an agarose gel and a gel slice containing fragment sizes between 300 
and 600 bp is removed. Fragments smaller than 300 bp are avoided to reduce the 
probability that the repeat will be so close to one end of the fragment that primer 
development is impossible. The size-selected genomic DNA is purified and ligated 
into the plasmid vector. The ligation mixture is transformed into competent cells. 
The resulting plasmid library is ready for screening or storage. 
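
The size-selection step can be mimicked in silico to estimate how many clonable fragments a given digest might yield. The following is a minimal sketch, assuming the genomic DNA is available as a plain string and that the enzyme cuts immediately 5' of every GATC (the recognition site of SauIIIA/Sau3AI); the function name and the default 300 to 600 bp window are illustrative choices, not part of any published protocol.

```python
import re

def size_select(genome, site="GATC", min_bp=300, max_bp=600):
    """Cut `genome` at every occurrence of `site` and keep fragments
    whose length falls inside the size window (inclusive)."""
    # Assumption: the enzyme cuts just before the recognition site,
    # so every internal fragment begins with the site itself.
    cut_positions = [m.start() for m in re.finditer(site, genome)]
    bounds = [0] + cut_positions + [len(genome)]
    fragments = [genome[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
    return [f for f in fragments if min_bp <= len(f) <= max_bp]

# Toy sequence with cut sites at positions 100 and 500.
toy = "A" * 100 + "GATC" + "C" * 396 + "GATC" + "T" * 50
selected = size_select(toy)
print([len(f) for f in selected])
```

Only the middle fragment survives the window here; the short end fragments are discarded, much as they would be by excising the 300 to 600 bp gel slice.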

B. Library Enrichment Techniques


Because fewer than 1 in 1000 clones will contain any given class of microsatellite, 
successful identification of multiple clones can be laborious and difficult. To speed 
the process, techniques exist for enriching the library for microsatellite-bearing 
clones. These techniques fall into three general categories: hybridization selection 
(Armour et al., 1994; Kandpal et al., 1994; Fleischer, 1996), primer amplification 
prior to cloning (Ostrander et al., 1992), and triplex affinity capture (Nishikawa 
et al., 1995). These enrichment procedures reduce the number of clones that need 
to be screened. Reduced screening can also be achieved by using libraries with large 
inserts, but the repeat may then be too far away from the cloning site to use vector 
primers to obtain flanking sequence. One can attack the distant flanking region 
problem, in turn, by using the repeat as a primer for sequencing the flanking region 
(Yuille et al., 1991; Baron et al., 1992; Koref et al., 1993; Hawkins et al., 1994; Rowe 
et al., 1994). 

In addition to library enrichment, a number of techniques exist for amplifying 
anonymous microsatellite loci, obviating the need for cloning (Charlieu et al., 1992; 
Laurent et al., 1994; Wu et al., 1994; Zietkiewicz et al., 1994). 


C. Screening the Library for 
Microsatellite-Bearing Clones 


Which microsatellite repeats should be used for screening? The choice is a tradeoff. 
Dinucleotide repeats are more numerous (Tautz et al., 1986; Shriver et al., 1993), 
and therefore easier to find, but the larger tri- and tetranucleotide repeats are easier 
to genotype because alleles have larger size differences. The efficiency of the screening
process can be increased by probing with multiple repeats at once, which must
be grouped according to similar melting temperatures. Clones are plated onto large 
agar plates (132 mm) to a density of about 1000 clones and replicates are made on 
filters. These filters are then screened with radiolabeled (or nonisotopically labeled) 
oligonucleotides containing the desired repeat(s). Positive clones are picked and 
usually subjected to a second round of screening to confirm positives. These positive 
clones are then sequenced to confirm the presence of a repeat and to obtain flanking 
sequence information for development of primers. The lower the repeat number, 
the less likely it is that the locus will be polymorphic. Accordingly, only clones 
containing eight or more repeats are usually developed. Commercial programs exist 
for developing primers (see also PRIMER, in Appendix I). 
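
Once clones have been sequenced, the two criteria described above (eight or more repeats, and enough flanking sequence for primers) can be checked automatically. The sketch below is illustrative only: the function name, the default motif, and the 20-bp minimum flank are assumptions, and it searches a single strand for perfect repeats of one motif.

```python
import re

def find_microsatellites(seq, motif="CA", min_repeats=8, min_flank=20):
    """Locate perfect tandem repeats of `motif` in a clone sequence and
    report whether both flanks leave room for a PCR primer."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_repeats))
    hits = []
    for m in pattern.finditer(seq):
        n_repeats = (m.end() - m.start()) // len(motif)
        left_flank = m.start()             # bases before the repeat
        right_flank = len(seq) - m.end()   # bases after the repeat
        hits.append({
            "start": m.start(),
            "repeats": n_repeats,
            "primer_room": left_flank >= min_flank and right_flank >= min_flank,
        })
    return hits

clone = "G" * 30 + "CA" * 12 + "T" * 30
print(find_microsatellites(clone))
```

A clone whose repeat sits too close to one end is still reported, but with `primer_room` set to False, which corresponds to the situation that size selection above 300 bp is meant to reduce.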


D. Genotyping 


Alleles are resolved by acrylamide gel electrophoresis and usually visualized by ra- 
diolabeling the PCR products and exposing X-ray film to the acrylamide gel. We 
have described a technique for staining the acrylamide gel with ethidium bromide,
allowing visualization of the DNA under ultraviolet (UV) light (Potts, 1996). By 
using ethidium bromide staining (as we do) a microsatellite laboratory can operate 
with little or no need for the extra facilities required for isotope use. 

The entire PCR and gel electrophoresis process takes only 6 to 8 hr; one can run
two to five 60-lane gels per day and so produce 120 to 300 genotypes per day. It is this
high-efficiency genotyping that makes the somewhat laborious development of
microsatellite loci worthwhile. Equipment for automating the genotyping of
microsatellites has recently become commercially available [e.g., Applied Biosystems
(Foster City, CA), LI-COR, Pharmacia (Piscataway, NJ)], further simplifying the
process.


E. Microsatellite Mutations 


The mutational mechanism causing microsatellite loci to be hypervariable is not
completely understood. It appears to be caused by a process referred to as "strand 
slippage” that occurs primarily during DNA replication (Schlotterer and Tautz, 
1992). A replicating DNA strand can slip one or more repeat units within a repeat 
and resume perfect base pairing. The resulting bulge can then be repaired, resulting 
in the addition or deletion of the nonpaired bases. There is evidence that the addi- 
tion or deletion usually involves a single repeat unit, with some evidence for rarer 
events of larger effect (Garza et al., 1995). Weber and Wong (1993) described 24 
microsatellite mutations occurring in the CEPH families; 20 involved changes of a 
single repeat unit and the remaining four changed two repeat units. This type of 
stepwise mutational process means that allelic variants of similar sizes are more 
closely related; the stepwise process can therefore provide additional information 
for phylogenetic reconstruction, as we develop below. 
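
The stepwise process can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a fitted model: the 20:4 ratio of one-unit to two-unit changes is taken from the Weber and Wong (1993) counts quoted above, whereas the equal probability of gains and losses, the lower bound of one repeat, and all function names are assumptions made for illustration.

```python
import random

def mutate(repeats, rng, p_single=20 / 24, min_repeats=1):
    """Apply one stepwise mutation to an allele of `repeats` units."""
    step = 1 if rng.random() < p_single else 2   # mostly single-unit changes
    direction = rng.choice((-1, 1))              # assumed: gain and loss equally likely
    return max(min_repeats, repeats + direction * step)

def simulate_lineage(start, n_mutations, seed=0):
    """Return the repeat number of one allele lineage after each mutation."""
    rng = random.Random(seed)
    history = [start]
    for _ in range(n_mutations):
        history.append(mutate(history[-1], rng))
    return history

history = simulate_lineage(start=15, n_mutations=50, seed=1)
print(history)
```

Because the walk moves in both directions, a lineage can revisit a repeat number it held earlier, so alleles of equal length are not necessarily identical by descent.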


III. THE RANGE OF SCALES


In the following sections we outline the range of scales that we see as appropriate 
when using microsatellites as genetic markers (Table I). 


A. Among Genes 


Microsatellites can serve as genetic markers at the level of genes in at least two ways. 
One is in gender determination (Delehanty, 1995; Longmire et al., 1993), and the 
other is in mapping genetic traits. Longmire et al. (1993) used microsatellite repeat 
sequences [e.g., (GT)n] as probes, without the work of sequencing the flanking
regions to develop primers. As more microsatellite flanking region sequences are
published, single-locus probes from the sex-determining chromosome (W in birds)
should become available for many species. Because females are the heterogametic
sex in birds (WZ), a microsatellite locus on the W chromosome will serve as an
unequivocal marker for females. Birds showing a band must be females, while those
lacking a band could be either males or marker failures. A microsatellite marker
from the Z chromosome will be informative only for heterozygous males.


TABLE I  Range of Scales at Which Microsatellites Should Prove Useful

Level of organization         Subject of inquiry                        Potential problems
----------------------------  ----------------------------------------  -----------------------------------------------------
Among genes                   Sex determination                         Need locus on sex chromosome
                              Gene mapping                              Linkage
Among individuals             Parentage                                 Null alleles, intergenerational mutations
                              Relatedness                               Null alleles, lack of pedigrees
Among populations             Population subdivision                    Variable mutation rates, variable importance of drift
                              Phylogeography                            Homoplasy, high mutation rate, nonlinear divergence
Among species                 Fine-grained phylogeny                    Homoplasy, high mutation rate, nonlinear divergence
Among higher taxa             Phylogenetics by gene arrangement         Homoplasy, nonlinearity
Temporal (0-300+ years bp)    Temporal changes in genetic variation     Small, degraded DNA segments, contamination
                              Study of extinct/endangered populations

Microsatellites are now the primary tool for mapping the human and mouse 
genomes (Dietrich et al., 1996). They are used as markers linked to loci of interest, 
such as the "obesity gene" in mice (Zhang et al., 1994). Microsatellites may even- 
tually be important for mapping genes and gene arrangements in natural popula- 
tions of birds. Current gene-mapping projects are largely restricted to studies in 
humans, using sibling analysis, and in laboratory mice, using inbred strains. For 
birds, the first applications will probably come from poultry (Cheng and Critten- 
den, 1994; Burt et al., 1995), or possibly from natural populations for which exten- 
sive pedigrees are available. Microsatellites developed to map economically impor- 
tant domesticated species may then work in very closely related species for which 
the pattern of linkage between the microsatellite markers and the genetic traits is 
similar. The potential to use extensive pedigrees in birds showing genetic mo- 
nogamy points to an unforeseen advantage of long-term field studies. 

B. Among Individuals


Offspring are the most obvious component of fitness, and determining parentage is 
therefore of considerable interest to evolutionary biologists. Allozymes and multilocus
minisatellite probes were important tools in revising much of the conventional
wisdom concerning avian mating systems. In many species, it became evident
that superficial behavioral monogamy did not correspond to patterns of genetic 
parentage. In some behaviorally monogamous species, extrapair fertilizations (EPFs) 
and mixed paternity were found to be common [Westneat (1987), 29% of young 
by EPF; Stutchbury et al., 1994]. In several polygynous species, such genetic mark- 
ers have shown that females may rarely mate with more than a single male in a 
season [Hartley et al. (1993), 4.5% EPF; Hasselquist et al. (1995), 3.1% EPF], al- 
though the most successful males will have many mates. 

The availability of numerous, highly polymorphic markers makes microsatellites 
a good choice for assessing parentage (Chakraborty et al., 1988). Nevertheless, as is 
true of any genetic marker, microsatellites are not a panacea. As Strassmann et al.
(1996) point out, even an error rate approaching 0.001 in scoring can produce a 
nonnegligible incidence of false parentage exclusions. 
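
A rough calculation shows why so small an error rate matters. The sketch below assumes that the error rate applies independently to each single-locus genotype scored and that a parentage test involves an offspring and two candidate parents; it computes the chance that at least one genotype in the trio is misscored, which is an upper bound on the false exclusion rate because not every scoring error creates a mismatch. The function name and the choice of 10 loci are illustrative.

```python
def prob_any_scoring_error(error_rate, n_loci, n_individuals=3):
    """Probability that at least one genotype is misscored when
    `n_individuals` are each typed at `n_loci` loci."""
    n_genotypes = n_loci * n_individuals
    return 1.0 - (1.0 - error_rate) ** n_genotypes

# One offspring plus two candidate parents typed at 10 loci.
p = prob_any_scoring_error(error_rate=0.001, n_loci=10)
print(round(p, 4))
```

Under these assumptions roughly 3% of trios would contain at least one misscored genotype, so in a study of several hundred offspring a handful of apparent exclusions could arise from scoring error alone.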

Relatedness among individuals is a cornerstone of much of the theory of behav- 
ioral ecology, because of the contribution of relatives to inclusive fitness. Queller 
et al. (1993) and Westneat and Webster (1994) provided useful overviews of the 
application of microsatellite markers to questions of relatedness among individuals. 
Blouin et al. (1996) used microsatellites to assess relatedness among mice (Mus mus- 
culus). In the section of case histories, we provide an example from our study of 
long-tailed manakins, Chiroxiphia linearis (McDonald and Potts, 1994). 


C. Among Populations 


Michod (1980) argued that the hard core of the modern synthesis was the devel- 
opment of the "beanbag" population genetics theory by Wright (1969; 1978), 
Fisher (1958), and Haldane (1966). Most of the work by these pioneers dealt with 
theory based on gene frequencies. The development of allozyme electrophoresis 
provided a laboratory method for assessing such gene frequencies in natural popu- 
lations. Microsatellites share with allozymes the advantage of being locus specific, 
and of being inherited in Mendelian fashion. They therefore appear well suited to 
analysis with the full repertoire of models developed to analyze allozyme data. These 
include measures such as F statistics (Wright, 1978; Weir, 1996) and various genetic 
distance measures (Rogers, 1972; Nei, 1978; Reynolds et al., 1983). Microsatellites, 
except those strongly linked to coding regions under selection, seem more likely to 
meet the assumption of neutrality that has been questioned for some allozyme 
analyses (Karl and Avise, 1992). The sampling necessary to examine variation 
among populations may entail little extra field and laboratory effort beyond that 
already entailed by a within-population study. 
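
As an illustration of how allele frequencies feed into such measures, the following sketch computes a simple fixation index for one locus as (H_T - H_S) / H_T, where H_S is the mean expected heterozygosity within populations and H_T is the expected heterozygosity of the pooled allele frequencies. It assumes equal weighting of populations and ignores sample-size corrections, so it is not the estimator of any particular cited author; all names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), from allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())

def fst(pop_freqs):
    """Fixation index for one locus from a list of per-population
    allele-frequency dictionaries (populations weighted equally)."""
    alleles = set().union(*pop_freqs)
    n = len(pop_freqs)
    h_s = sum(expected_heterozygosity(f) for f in pop_freqs) / n
    mean_freqs = {a: sum(f.get(a, 0.0) for f in pop_freqs) / n for a in alleles}
    h_t = expected_heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0

# Alleles are labeled by repeat number; frequencies are invented.
pops = [{14: 0.7, 15: 0.3}, {14: 0.2, 15: 0.5, 16: 0.3}]
print(round(fst(pops), 3))
```

Populations fixed for different alleles give a value of 1, and populations with identical frequencies give 0, which provides a quick check on the implementation.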

As noted above, evidence suggests that microsatellites mutate in stepwise fashion,
such that similarity in repeat number indicates recent common ancestry of alleles.
Slatkin (1995a,b), Goldstein et al. (1995a), Shriver et al. (1995), and Michalakis and
Excoffier (1996) used this inherently phylogenetic information to generate genetic 
distance measures. In this respect microsatellite variation may thus resemble some 
phylogenetic applications of mitochondrial DNA (mtDNA). Such "phylogeo- 
graphic" studies based on mtDNA combine a fine-grained phylogenetic approach 
with the study of geographic genetic variation (Ball and Avise, 1992; Ball et al., 
1988). In humans, microsatellites provide a much clearer view of such geographic
pattern among populations than does mtDNA (Bowcock et al., 1994). We envision
considerable theoretical attention to the special problems and prospects afforded
by microsatellite data as applied to population subdivision; the application of
measures of population subdivision to microsatellite data will not, however, be 
completely straightforward. 
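
One way to see how repeat number itself can enter a distance measure is the squared difference in mean allele size between two populations, averaged over loci, a quantity usually written (delta mu)^2 and associated with Goldstein and colleagues. The sketch below is a bare-bones version offered under that reading; it assumes alleles are recorded as repeat counts and omits any sampling correction, and the function names are hypothetical.

```python
def mean_repeat_number(allele_counts):
    """Mean allele size at one locus from {repeat_number: count}."""
    total = sum(allele_counts.values())
    return sum(size * count for size, count in allele_counts.items()) / total

def delta_mu_squared(pop_a, pop_b):
    """Average over loci of the squared difference in mean repeat number.
    `pop_a` and `pop_b` are lists of per-locus allele-count dictionaries
    given in the same locus order."""
    per_locus = [
        (mean_repeat_number(a) - mean_repeat_number(b)) ** 2
        for a, b in zip(pop_a, pop_b)
    ]
    return sum(per_locus) / len(per_locus)

# Two loci; the populations differ in mean size at the first locus only.
pop_a = [{10: 2, 12: 2}, {8: 4}]
pop_b = [{14: 4}, {8: 4}]
print(delta_mu_squared(pop_a, pop_b))
```

Unlike a frequency-based distance, this measure treats an allele of 14 repeats as more distant from one of 10 repeats than from one of 13, which is the sense in which the stepwise process adds information.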

A few microsatellite studies of nonavian animals have addressed issues of model 
assumptions and the suitability of the models for microsatellite data sets. Estoup
et al. (1995) addressed the problem of stepwise mutation versus infinite allele models 
of the mutation process. Their conclusion that stepwise models were not signifi- 
cantly better may be due, in part, to their using compound repeats comprising two 
or three different length motifs rather than perfect tandems. Such compound mi- 
crosatellites seem unlikely to be a major element of microsatellite analyses in birds, 
because screening will usually be confined to perfect repeats. In a population-level 
study that we will explore in more detail as a case history, Allen et al. (1995) dis- 
cussed the validity of a number of assumptions concerning the mutation process and 
the balance between mutation and drift. 

A number of studies have used microsatellites to assess aspects of gene flow 
among populations. Gibbs et al. (1996) used microsatellite loci to examine variation 
among host races of the common cuckoo (Cuculus canorus). They showed that host 
specialization is not mirrored by genetic structuring in microsatellite loci or the 
mtDNA control region. Their results point to several avenues for further research 
on the way in which the parasite matches its host's egg color. Paetkau and Strobeck 
(1995) assessed population structure in polar bears (Ursus maritimus), and were able 
to show that despite long-distance seasonal movements, local populations main- 
tained distinct genetic profiles and that patterns of gene flow may not be obvious 
from geographical proximity alone. Dallas et al. (1995) used microsatellites to show 
that subpopulations of mice (M. musculus) show considerable connectivity that may 
be due to migration or recurrent founder events from diverse source stocks. 

A potentially important application of microsatellites is for conservation studies. 
For birds, an example would be characterizing neotropical migrants in order to link 
information on wintering, stopover, and breeding habitats. The study of apparent 
declines in neotropical migrants (Robinson et al., 1995; but see James et al., 1992) 
requires that populations be followed in all their habitats. Unfortunately, it has rarely 
been possible to trace migrant populations across habitats between seasons. The 
high degree of detail available in microsatellite data provides a potentially powerful 
way to assign wintering individuals to a breeding population or vice versa, as Wenink
et al. (1993) did for dunlin (Calidris alpina), using mtDNA. Because PCR amplification
allows analysis from minute quantities of DNA, microsatellite analyses
are also feasible with noninvasive sampling such as the use of shed hair in endan- 
gered primates (Morin et al., 1994). Houlden et al. (1996) used microsatellites to 
examine the effects of population bottlenecking in koalas (Phascolarctos cinereus) that 
underwent near-extinction population crashes. 


D. Among Species 


Microsatellite primers will often work across avian congeners or even across an en- 
tire family (Hanotte et al., 1994; McDonald and Potts, 1994; Primmer et al., 1996). 
In such cases, microsatellites may be useful in assessing relationships among species. 

Microsatellite data can provide useful evidence for conservation decisions at or 
above the species level. Roy et al. (1994a) examined variation and hybridization
among wolflike canids. They concluded that the red wolf (Canis rufus) is clearly a 
hybrid between the gray wolf (Canis lupus) and the coyote (Canis latrans). Exhaus- 
tive or expensive efforts to preserve this form may, therefore, be less appropriate 
than efforts to preserve truly distinct forms (Garcia-Moreno et al., 1996) such as the 
Mexican wolf (Canis lupus baileyi). 

Our case histories (Section V) include the study of Forbes et al. (1995) compar- 
ing levels of genetic variation in domestic sheep (Ovis aries) with those of Rocky 
Mountain bighorn sheep (Ovis canadensis). 


E. Among Genera and Higher Taxa 


Direct examination of microsatellite distance measures among genera and higher 
taxa will probably be less fruitful than studies at lower taxonomic levels. The prob- 
lems (see Section IV) of homoplasy, constraints on repeat number, nonlinear diver- 
gence, and high mutation rate are likely to make such analyses unrewarding in 
comparison to those using other molecular markers more suited to these coarse- 
grained analyses. Microsatellites may, however, serve a key role as indirect markers 
for major gene rearrangements. When avian gene maps are eventually developed, 
avian biologists will be able to compare genomic organization at intergeneric and 
higher levels. Such gene rearrangements will be rare events that should characterize 
deep phylogenetic branching points [see T. Quinn (Chapter 1 in this volume) for 
an application of this method using mtDNA]. That is, the gene rearrangements 
would function as synapomorphies unifying groups of taxa. As noted above, micro- 
satellites are a key marker for mapping genes. 

F. Temporal Scales (Ancient DNA)


Thus far, we have addressed scale in terms of level of biological organization. Mi- 
crosatellites also provide a powerful opportunity to examine problems across tem- 
poral scales. Because the primer pairs are short, and the microsatellites can be am- 
plified by PCR, even minute quantities of ancient, degraded DNA can be analyzed 
(Ellegren, 1991; Roy et al., 1994a; Taylor et al., 1994). Genotyping of museum 
specimens permits assessment of variation among populations that are no longer 
extant, as well as among specimens from extant populations sampled one hundred 
or more years ago. 


IV. CAUTIONS FOR DATA ANALYSIS 


Because others (Bruford and Wayne, 1993; Schlotterer and Pemberton, 1994; 
Westneat and Webster, 1994) compared microsatellites to a variety of other genetic 
markers, we concentrate here on a few comparisons not covered in those sources, 
focusing on caveats particular to microsatellites. 

Differences in mutation processes and rates between microsatellites and allo- 
zymes provide both opportunities and problems for analysis of population subdivi- 
sion. In allozyme analyses, new alleles arise by point mutations, and allelic variation 
has usually been modeled under the assumption of infinite alleles. For microsatel- 
lites, the stepwise mutation process whereby repeats are usually added or subtracted 
one at a time suggests that similarity of length reflects allelic relatedness (Slatkin, 
1995a,b). Theoretical models take advantage of the stepwise process to develop 
measures of genetic distance unique to microsatellites (Goldstein et al., 1995a,b; 
Shriver et al., 1995; Slatkin, 1995a,b; Michalakis and Excoffier, 1996). Because both 
additions and deletions are possible, however, microsatellites will exhibit length ho- 
moplasy (variants with the same overall length that are not identical by descent). 
Even sequencing rather than electrophoretic measurement of repeat number cannot 
reveal homoplasy due to addition and subsequent deletion of identical repeat units 
[e.g., one lineage that changes from (CA)n to (CA)n+1 and then back to (CA)n, 
while another lineage remains unchanged]. What sequencing can do, however, is 
reveal more complex patterns of length homoplasy resulting from nonidentical se- 
quences with the same overall repeat size (Garza et al., 1995). In some cases, such 
length homoplasies will derive from variation in the flanking regions, and may in- 
volve point mutations as well as, or instead of, the more frequent slippage events 
that presumably drive the stepwise process. Because of the high slippage mutation 
rate of microsatellites, a given time period will involve more divergence of the re- 
peat units than that expected from point mutations in the flanking regions. As a 
result, the temporal scale over which microsatellites provide sufficient resolution 
may be narrower than that of markers that mutate more slowly, such as allozymes. 
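
As a rough illustration of how a size-based measure uses the stepwise process, the sketch below computes the squared difference in mean repeat number between two samples, the quantity usually written (delta mu)^2 and associated with Goldstein et al. (1995b). The function name and the repeat counts are hypothetical and chosen only for illustration; this is not the implementation of any published program.

```python
from statistics import mean


def delta_mu_squared(repeats_a, repeats_b):
    """Squared difference in mean allele size (in repeat units) between two samples."""
    if not repeats_a or not repeats_b:
        raise ValueError("both samples must contain at least one allele")
    return (mean(repeats_a) - mean(repeats_b)) ** 2


# Hypothetical repeat counts scored at one locus in two populations.
pop_a = [15, 15, 16, 17, 15, 16]
pop_b = [18, 19, 18, 20, 19, 18]
print(delta_mu_squared(pop_a, pop_b))

# Length homoplasy in miniature: samples with identical allele sizes give a
# distance of zero, whether or not the alleles are identical by descent.
print(delta_mu_squared([15, 16], [15, 16]))
```

Because the measure depends only on allele length, it cannot distinguish alleles that are identical by descent from alleles that have converged on the same length.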

A further complication in analyzing microsatellite data is the presence of unamplified 
"null alleles" (Callen et al., 1993; Paetkau and Strobeck, 1994; Pemberton 
et al., 1995). One cause of null alleles is point mutations in the flanking region that 
prevent PCR amplification. As a result the single, amplified band from an individual 
may make it appear to be a homozygote, despite an unamplified, null allele that 
differs from the amplified band. This may lead to an apparent deficit of heterozy- 
gotes compared to Hardy- Weinberg expectation, and may mask divergence among 
populations that contain variants sufficiently different (via point mutations in the 
flanking region) to prevent amplification. Because of their apparent origin as point 
mutations in flanking regions, null alleles will tend to be particularly prevalent in 
applications of primers to species other than those in which they were designed. 
Brookfield (1996) describes a method for assessing the frequency of null alleles. Null 
alleles are not the only possible cause of heterozygote deficiency or excess, however. 
For example, Wahlund’s principle can produce heterozygote deficiency when 
samples from divergent populations are pooled. Weir's (1996) book provides a useful 
compendium of several classic and newer methods of data analysis applicable to 
microsatellites. 
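
The heterozygote deficit described above can be turned into a rough estimate of null allele frequency. The sketch below assumes the estimator commonly attributed to Brookfield (1996), r = (He - Ho)/(1 + He), where He and Ho are the expected and observed heterozygosities; the formula as written here is our reading of that estimator, and the allele frequencies and observed heterozygosity are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Hardy-Weinberg expected heterozygosity, 1 - sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def null_allele_frequency(h_exp, h_obs):
    """Null-allele frequency estimate, assumed form r = (He - Ho) / (1 + He)."""
    return (h_exp - h_obs) / (1.0 + h_exp)


# Hypothetical locus: three visible alleles and an observed heterozygosity
# lower than the Hardy-Weinberg expectation.
freqs = [0.5, 0.3, 0.2]
h_exp = expected_heterozygosity(freqs)
h_obs = 0.50
print(round(h_exp, 3), round(null_allele_frequency(h_exp, h_obs), 3))
```

An estimate of this kind attributes the whole deficit to null alleles, so it will be inflated whenever another cause, such as pooling of divergent populations, contributes to the deficit.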

A critical point in the application of any model to data is the degree to which 
the model's assumptions are violated in nature. Relatively little work has been done 
in this area (Scribner et al., 1994). Different models make different assumptions 
concerning the relative roles of drift and mutation. Slatkin (1985) distinguished 
models based on the equations for differentiation in an n-island model. One such 
equation is 


$$\frac{f_0 - \bar{f}}{1 - \bar{f}} = \frac{1}{4Nm\alpha + 1} \tag{1}$$


where $\bar{f} = [f_0 + (n - 1)f_1]/n$ is the average probability of identity of two alleles 
drawn at random from the population, $f_0$ and $f_1$ are the probabilities of identity of 
two alleles drawn randomly from the same and two different subpopulations, respectively, 
$N$ is the population size, $m$ is the migration rate, and $\alpha = [n/(n - 1)]^2$. 
Equation (1) leads to measures such as FST. Although the mutation term, $u$, is mentioned 
in some formulations of FST (Wright, 1969) where 


$$F_{ST} = \frac{1}{4N(m + u) + 1} \tag{2}$$


the u is omitted in most or all implementations (computer programs), because the 
programs were developed for allozyme data, with mutation as a negligible force. An 
alternative formulation is 


$$\frac{f_1}{f_0} = \frac{m}{m + (n - 1)u} \tag{3}$$


which leads to measures such as Nei's genetic distance, D. Slatkin (1985) emphasized 
that the major distinction between the formulations is their relative emphasis 
on either drift [Eqs. (1) and (2); measures such as FST] or mutation [Eq. (3); measures 
such as D]. 
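
Equation (1) can be rearranged to show how an estimate of gene flow is usually extracted from a drift-based measure. The lines below are a sketch under stated assumptions: the left-hand side of Eq. (1) is identified with FST, the populations are at equilibrium in the n-island model, and alpha takes the form $[n/(n - 1)]^2$.

```latex
\[
  Nm \;=\; \frac{1}{4\alpha}\left(\frac{1}{F_{ST}} - 1\right),
  \qquad
  \alpha = \left(\frac{n}{n-1}\right)^{2}.
\]
% With many subpopulations, alpha approaches 1 and the familiar form follows:
\[
  Nm \;\approx\; \frac{1}{4}\left(\frac{1}{F_{ST}} - 1\right)
  \quad (n \to \infty),
  \qquad\text{whereas } \alpha = 4 \text{ for } n = 2 .
\]
```

Under these assumptions, the same value of FST implies a fourfold smaller Nm when only two subpopulations exchange migrants than when the number of subpopulations is large.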

In most of the size-based, stepwise models developed for microsatellites, drift is 
considered to be negligible relative to the effect of relatively high mutation rates 
characteristic of microsatellites (10 ? to 10 >). Depending on the particular history 
of populations, and the nature and mutation rate of the microsatellite loci analyzed, 
one can imagine different populations having a full spectrum of balances between 
the effects of drift and mutation. We provide an example of evidence for unmet 
assumptions in our case history of work by Allen et al. (1995) on gray seals 
(Halichoerus grypus). 


V. CASE HISTORIES 


A. Relatedness among Partners 
in Male Long-Tailed Manakins 


Long-tailed manakins (C. linearis) are fruit-eating neotropical birds with a lek mat- 
ing system. The essential defining characteristics of lek mating systems are that males 
provide no material benefits to breeding females (such as nest sites, feeding territo- 
ries, or paternal care), and that females exercise some degree of choice among males, 
based on courtship displays or ornaments. Examples of lek-mating species include 
several grouse, most manakins, several birds of paradise, and two species of shore- 
birds [ruffs (Philomachus pugnax) and the buff-breasted sandpiper (Tryngites subrufi- 
collis)]. The five species of manakins in the genus Chiroxiphia have an unusual twist 
on the usually intense competition among males in lek-mating systems. Males co- 
operate for courtship display. In long-tailed manakins, two partnered males perform 
a unison song that resembles the word toledo (Trainer and McDonald, 1993; Trainer 
and McDonald, 1995). If the males persist in singing for several months or seasons, 
they may receive a female visit. Once a female arrives for a visit, the males move to 
a low dance perch for a dual-male backward leapfrog dance display. During the 
dance display, they alternate sets of as many as 100 backward leapfrog jumps with 
labored "butterfly" flight (McDonald, 1989a). Surprisingly, during the course of 
partnerships that may last as long as 8 years, only one of the two males copulates 
(McDonald, 1989b). It is always the alpha (dominant) male of the pair. A central 
problem of the senior author's long-term research on long-tailed manakins in Costa 
Rica has been to understand why the beta (subordinate) male cooperates in this 
extended courtship sequence. 

A potential explanation for the cooperation between males lies in the theory of 
inclusive fitness (Hamilton, 1964). If partners were close relatives, a beta male that 
helped his partner produce more copies of their shared genes might more than offset 
the cost of not transmitting his own genes directly. This hypothesis predicts that 
partners should be close relatives. We used microsatellite DNA to assess relatedness 
among males. We developed four polymorphic loci with a mean heterozygosity of 
0.42 and a mean of 2.8 alleles per locus. No allele had a frequency higher than 0.81. 
A locus is more informative (Queller and Goodnight, 1989) when it has multiple 
alleles of comparable frequency (e.g., four alleles, each at 0.25, rather than two 
alleles at 0.95 and 0.05). The four loci provided resolution sufficient to assess relat- 
edness among partners as compared to the average, background level of relatedness 
among males in the local population. We found that partners were no more closely 
related (r = -0.14) than males picked at random from the local population. Indirect 
inclusive fitness (kin selection) cannot, therefore, be invoked as a complete or partial 
explanation for cooperation in this species (McDonald and Potts, 1994). The mi- 
crosatellite data provided a powerful tool for rejecting an inherently plausible hy- 
pothesis for explaining cooperative behavior. 
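
To make the relatedness comparison concrete, the sketch below implements the general form of the Queller and Goodnight (1989) estimator as we understand it: for every allele carried by a focal individual x, the frequency of that allele in the partner y, minus its population frequency, is summed and divided by the corresponding sum for x itself. The function name, the data layout, and the genotypes are hypothetical, and the bias correction applied by dedicated programs (removing the focal individuals from the population frequencies) is omitted.

```python
def qg_relatedness(pairs, pop_freqs):
    """Relatedness of partners y to focal individuals x, summed over pairs and loci.

    pairs     : list of (x, y) genotypes; each genotype is a list holding one
                tuple of alleles per locus, e.g. [(1, 2), (3, 3)].
    pop_freqs : list with one dict per locus mapping allele -> population frequency.
    """
    numerator = 0.0
    denominator = 0.0
    for x, y in pairs:
        for locus, (gx, gy) in enumerate(zip(x, y)):
            for allele in gx:                      # each allelic position in x
                p_bar = pop_freqs[locus][allele]   # population frequency
                p_x = gx.count(allele) / len(gx)   # 0.5 or 1.0 in a diploid
                p_y = gy.count(allele) / len(gy)   # frequency in the partner
                numerator += p_y - p_bar
                denominator += p_x - p_bar
    if denominator == 0:
        raise ValueError("uninformative data: denominator is zero")
    return numerator / denominator


# Hypothetical single-locus example with three alleles.
freqs = [{1: 0.5, 2: 0.3, 3: 0.2}]
identical = [([(1, 2)], [(1, 2)])]   # partner carries the same genotype
no_shared = [([(1, 2)], [(3, 3)])]   # partner shares no alleles
print(qg_relatedness(identical, freqs))
print(qg_relatedness(no_shared, freqs))
```

Partners that share fewer alleles than expected from population frequencies yield negative values, which is why an estimate below zero is interpretable as "no more related than random." Estimates from a single pair at a single locus are very noisy; the estimator is meaningful only when summed over many pairs and loci.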


B. Gray Seals 


Allen et al. (1995) estimated gene flow in gray seals (H. grypus) at two colonies 
separated by approximately 500 km. The estimates from different measures differed 
widely. Using FST, Nm was 41; with Slatkin's (1995a,b) RST it was 13.8, and with 
Slatkin's private allele method it was 5.6. As they pointed out, the difference clearly 
suggests that the microsatellite data contravene one or more assumptions of the 
different models. They suggested that the low FST (high Nm) might result from the 
possibility of back mutations returning to previous states, contravening the assumption 
of an infinite number of potential alleles. They also assessed the effect of including 
the mutation term, u, as in Eqs. (1) and (3), which is usually omitted from 
implementations. For a mutation rate as high as 10^? (many dinucleotide repeats 
will have lower mutation rates), however, inclusion of the mutation term lowered 
Nm only slightly (from 41 to 40.8). 
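
The size of this effect can be checked numerically. The sketch below assumes the island-model relation FST = 1/[4N(m + u) + 1] and solves it for Nm. The value Nu = 0.2 is hypothetical, chosen only because it reproduces a shift of the size reported; it is not taken from Allen et al. (1995).

```python
def nm_from_fst(fst, n_times_u=0.0):
    """Solve FST = 1 / (4N(m + u) + 1) for Nm, given the product Nu."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0 - n_times_u


# FST value that corresponds to Nm = 41 when mutation is ignored.
fst = 1.0 / (4 * 41 + 1)
print(round(nm_from_fst(fst), 2))                   # mutation ignored
print(round(nm_from_fst(fst, n_times_u=0.2), 2))    # hypothetical Nu = 0.2
```

Because u enters only through the product Nu, its influence on the estimate is small whenever Nu is small relative to Nm.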


C. Domestic and Bighorn Sheep 


Forbes et al. (1995) examined microsatellite variation within and between domestic 
sheep (O. aries) and Rocky Mountain bighorn sheep (O. canadensis). Their major 
conclusion was that "classic" methods such as FST or Nei's D differed considerably 
from allele size-based methods such as those of Goldstein et al. (1995a) and Slatkin 
(1995a,b). The classic methods were more sensitive to population differences within 
species, while the size-based measures yielded results more consistent with other 
biogeographical and genetic analyses. Secondarily, they pointed to the problem of 
interpretation when markers developed in one species are applied to another. “Dif- 
ferences in allele size and polymorphism among taxa may be explained by bias in 
the cloning and characterizing [of] microsatellite loci (FitzSimmons et al., 1995; 
Pepin et al., 1995).” As pointed out by Ellegren et al. (1995), screening selects for 
repeats that are longer, and more polymorphic than the average. When the primers 
are applied to other taxa, the repeat length will tend to be shorter and the locus less 
polymorphic. 

In some cases, it will be difficult to know a priori whether assumptions are met. 
For example, one may lack the biogeographic and demographic data necessary for 
assessing assumptions concerning the relative importance of mutation and drift. In 
such cases it may be instructive to compare measures such as the stepwise models to 
measures such as F statistics at several hierarchical scales in order to assess the relative 
applicability of the different models (Slatkin, 1995a,b). Microsatellites are not 
simply glorified allozymes, and new methods will be required to deal with all the 
ramifications of mutation process and rate that they entail. 


VI. SUMMARY AND CONCLUSIONS 


The rise of PCR-based microsatellite markers provides abundant opportunities, but 
also calls for caution. Because of their suitability for addressing problems at a fairly 
wide range of levels of organization from within individuals to among populations 
or species, we foresee a considerable expansion of interest in the interactions across 
these scales. For example, how do social systems and fine-scale demography affect 
population subdivision? Which factor generally has the greatest impact on population 
subdivision in birds: sexual selection, with its potential effects on effective 
population size through the sex with the higher variance in reproductive success, or 
the sedentary habit (low dispersal distance) of many cooperative breeders? The re- 
surgence of population-level studies that should follow in the wake of widespread 
development of microsatellite loci in a variety of taxa calls for development of well- 
documented computer software packages with which to analyze the resulting data. 

The need for careful assessment of the fit between the demographic parameters 
of the population analyzed (migration, drift, mutation rate) and the assumptions of 
the models used to analyze it cannot be overemphasized. Application of an inappro- 
priate model can lead to the wrong conclusions about the patterns and processes of 
genetic differentiation. We foresee a considerable need for theoretical work to ex- 
plore the fit of conventional measures of population subdivision (e.g., F statistics) 
to microsatellite data sets, as well as continued exploration of the ramifications of 
the nature of the mutation process for genetic models and analyses. 


Appendix I  Internet and Other Resources for Microsatellite Analyses (a)

Program name       Task                                  Ref.                             Internet access (URL)                                         Platform

Primer development
  GenBank          List of sequences                     NA                               See MsatManV6                                                 Internet
  PRIMER           Primer design                         NA                               Via e-mail to primer@genome.wi.edu                            SPARC, PC, or Mac

Laboratory protocols
  MsatManV6        Detailed protocols                    NA                               Via ftp from onyx.si.edu/protocols/MsatManV6                  [Text file] MS-Word
  [Book chapter]   Detailed protocols                    Strassmann et al. (1996)         NA                                                            NA

Genetic analysis (e.g., FST)
  Relatedness 4.2b r (coefficient of relatedness)        Queller and Goodnight (1989)     Via www from http://www.rice.edu/wasps                        Macintosh
  GENEPOP          Various population genetics analyses  Raymond and Rousset (1995)       Via ftp from ftp.cefe.cnrs-mop.fr/pub/msdos/genepop           DOS
  GDA              Various population genetics analyses  Weir (1996)                      Via www from http://www2.ncsu.edu/ncsu/CIL/stat_genetics (b)  Windows (PC)
  MSAT             Stepwise mutation distance measures   Goldstein et al. (1995)          Via ftp from lotka.stanford.edu/pub/programs                  Sun, DOS, Mac, DEC
  WINAMOVA         Stepwise mutation distance measures   Michalakis and Excoffier (1996)  Via ftp from acasun1.unige.ch/pub/comp/win/amova;             Windows (PC)
                                                                                          http://acasun1.unige.ch/LGB/Software/Windoze/amova

(a) Internet sites and URLs may change frequently. 
(b) Listing in Weir (1996) lacks /ncsu. 
Abbreviation: NA, not applicable. 


I. INTRODUCTION 


The mitochondrial control region containing the three-strand displacement loop 
(D loop) characteristic of vertebrate mitochondrial DNA (mtDNA) has attracted 
the attention of systematists and population geneticists as a potential source of 
genetic markers within and among closely related species. Focus on this non-coding 
sequence stemmed from earlier reports suggesting that it was the most rapidly 
idly evolving region of the mtDNA molecule (Fauron and Wolstenholme, 1976; 
Upholt and Dawid, 1977; Walberg and Clayton, 1981; Chang and Clayton, 1985). 
This prediction was borne out when estimates of the rate of substitution in 
the human control region were found to range between 2.8 (Cann et al., 1984) and 
five times (Aquadro and Greenberg, 1983) the rate for the rest of the mtDNA 
genome. 

The much publicized phylogeny of humans based on hypervariable control re- 
gion sequences (Vigilant et al., 1991) was a major stimulus for assays of control 
region sequence variability in other vertebrates including birds. Sequencing and 
mapping studies have established that the gene order around the bird control region 
has been altered relative to other vertebrates (Desjardins and Morais, 1990, 1991; 
Ramirez et al., 1993; Quinn and Wilson, 1993); the avian control region is flanked 
by the genes for tRNA-Glu and tRNA-Phe. Length variation in domain III (tRNA-Phe 
end) has been shown to be due to a variable number of simple tandem repeats 
(Wenink et al., 1994; Berg et al., 1995), and individuals are often heteroplasmic for 
repeat number (Berg et al., 1995). 

Applications of control region sequence data to population structure and syste- 
matics of birds are few, but have mostly been instructive in revealing the increased 
resolution afforded by faster mutating sequences. Using just 178 bp of the control 
region, Quinn (1992) was able to uncover the historical mixing of two divergent 
clades of the snow goose (Anser caerulescens), confirming results of a restriction frag- 
ment length polymorphism (RFLP) study of the whole genome (Avise et al., 1992). 
Control-region sequences have elucidated the global phylogeography of the dunlin 
(Calidris alpina) (Wenink et al., 1993; 1996), gene flow and population structure 
among social groups and populations of the gray-crowned babbler (Pomatostomus 
temporalis) (Edwards, 1993a,b), and apparent global panmixia in knots (Calidris 
canutus) (Baker et al., 1994) and possibly turnstones (Arenaria interpres) (Wenink et al., 
1994). 

Although control region sequences often can be a rich source of genetic markers, 
they are not universally informative or highly polymorphic within species (Baker 
et al., 1994). However, the major problems confronting avian systematists contemplating 
the use of control region sequences at present are the dearth of knowledge 
about sequence variation in more than a few species and uncertainty about the 
appropriateness of this region as a source of genetic markers for different levels of 
the taxonomic hierarchy. Our objectives in writing this chapter are to summarize what 
is known about the control region of birds in terms of its organization and the location 
of markers within it, and to present exemplars, mainly from our own laboratory, 
illustrating the potential and problems of such fast-evolving sequences in elucidating 
population structure and molecular systematics of closely related taxa. 


II. SEQUENCE ORGANIZATION 
AND EVOLUTION 


The organization and evolution of the D loop-containing region have been de- 
scribed for a number of groups of animals, including vertebrates (Brown et al., 1986; 
Mignotte et al., 1987; Saccone et al., 1987), mammals (Saccone et al., 1991), 
cetaceans (Hoelzel et al., 1991; Árnason et al., 1993), lepidopterans (Taylor et al., 1993), 
and insects (Zhang et al., 1995). Here we summarize the features of the control 
region of birds by aligning and comparing complete or partial sequences from a 
range of taxa. The reason for doing this is that knowledge of the organization of the 
control region is an essential precursor to using these sequences in population ge- 
netic and phylogenetic studies. For example, uncertain homology arising from hy- 
pervariability or heteroplasmy could confound measures of sequence diversity, 
molecular clock estimates, and phylogeny reconstruction. In addition, gene re- 
arrangements, large tandem duplications near the control region, and duplicate 
copies of mtDNA genes in the nuclear genome could all complicate homology 
determination. 


A. Control Region Sequence Data 
and Structural Features 


Complete control region sequences are now available from 10 avian taxa com- 
prising two Galliformes, three Anseriformes, three Charadriiformes, and two Pas- 
seriformes. Partial sequences are available for another 13 taxa from the follow- 
ing orders: Apterygiformes, Sphenisciformes, Columbiformes, Charadriiformes, 
Anseriformes, and Passeriformes (Table I). Locations of primers used to obtain 
these sequences are shown in Fig. 3.1, along with previously unpublished primer 
sequences. 


1. Gene Order 


The tRNA-Glu and ND6 genes in birds are found immediately adjacent to the D-loop 
region of the molecule instead of being located between the ND5 and cytochrome b 
genes as in other vertebrates (Fig. 3.1). This unusual gene order has been 
found in the domestic chicken and other galliforms (Desjardins and Morais, 1990), 
the Japanese quail (Desjardins and Morais, 1991), the Peking duck (Desjardins et al., 
1990), and the lesser snow goose (Quinn and Wilson, 1993). In addition, we have 
found evidence for this rearrangement in the passerine family Fringillidae (green- 
finch and common chaffinch), the Adélie penguin, and the brown kiwi, suggesting 
that this is a universal bird phenomenon, and must be due to an event occurring 
earlier in avian or reptilian history (see Quinn and Mindell, 1996). One result of this 


TABLE I  Control Region Sequences Included in This Study

Species                         Ref.                                              Region (a); size (bp)

Apterygiformes
  Apterygidae
    Apteryx australis           A. J. Baker (unpublished)                         I, II, and III; 772
Anseriformes
  Anatidae
    Anas acuta (b)              V. Ramirez and R. Morais (unpublished)            I; 403
    Anas platyrhynchos (b)      S. T. Liu and L. Y. Lin (unpublished)             All; 1131
    Anser caerulescens          Quinn and Wilson (1993)                           All; 1177
    Branta canadensis           A. J. Baker (unpublished)                         I; 446
    Cairina moschata (b)        S. T. Liu and L. Y. Lin (unpublished)             All; 1135
Charadriiformes
  Alcidae
    Alca torda                  Berg et al. (1995)                                III; 216
    Cepphus grylle              M. Kidd and V. Friesen (personal communication)   All; 1121
    Uria aalge                  Moum and Johansen (1992)                          I; 198
    Uria lomvia                 Moum et al. (1994)                                I; 193
  Laridae
    Larus argentatus            Berg et al. (1995)                                III; 214
    Larus fuscus                Berg et al. (1995)                                III; 214
  Scolopacidae
    Arenaria interpres          Wenink et al. (1994)                              All; 1192
    Calidris alpina             Wenink et al. (1994)                              All; 1072
    Calidris canutus            Baker et al. (1995)                               I; 300
Columbiformes
  Columbidae
    Columba inornata (b)        M. M. Miyamoto et al. (unpublished)               I; 451
Galliformes
  Phasianidae
    Coturnix japonica           Desjardins and Morais (1991)                      All; 1153
    Gallus gallus               Desjardins and Morais (1990)                      All; 1227
Passeriformes
  Fringillidae
    Carduelis chloris           H. D. Marshall and A. J. Baker (1997)             All; 1237
    Fringilla coelebs           H. D. Marshall and A. J. Baker (1997)             All; 1234
    Fringilla montifringilla    H. D. Marshall and A. J. Baker (1997)             II and III; 638
    Fringilla teydea            H. D. Marshall and A. J. Baker (1997)             II and III; 619
  Timaliidae
    Pomatostomus temporalis     Edwards (1993)                                    I; 399
Sphenisciformes
  Spheniscidae
    Pygoscelis adeliae          Monehan (1994)                                    I; 507

(a) Number refers to domain studied, as defined in text. 
(b) GenBank accession numbers: Anas acuta: L24205; Anas platyrhynchos: L16770; Cairina moschata: 
L16769; and Columba inornata: M98393. 

FIGURE 3.1 Schematic representation of the gene order near the control region, given for the L 
strand in 5'-to-3' orientation. Forward arrows represent approximate positions of forward primers, and 
reverse arrows represent reverse primers. The shaded arrow indicates the suspected priming site of 
GSLGLU. Control region primer references or sequences are as follows (5' to 3'): Fringillidae: CRTPRO 
CCA TCT CCA ACT CCC AAA GC; FND6 (P. Boag, personal communication); FCRI5' TCA GGG 
TAT GTA TAA TAT GC; MATS CCA TTG TCC CCT CCA GGC GC; GFCRH CTC GTT TCC 
TAG GTT GGA GG; GSLGLU (P. Boag, personal communication); FCRI3' CAC TTG CTG TGA 
AGA GC; F304 CTT GAC ACT GAT GCA CTT TG; F389 TAT GTC CGG CAA CCA TTA CAC 
TAT; H1261 AGG TAC CAT CTT GGC ATC TTC; Adélie penguin: ABB TGT TAC TTC AAC 
CAC ACC AAC; TS401H (Wenink et al., 1994); brown kiwi: BRTGLU TAG GTC TCA ACT ACA 
GAA AC; TS778H (Wenink et al., 1994); Canada goose: 16775L (Quinn, 1992); CDN-DL TAT GCA 
TAT TCG TGC ATA GA; CDN-DL2 ACG TGA AAT CAG CAA CCC G; H521 (Quinn and Wilson, 
1993); H1247 (Quinn and Wilson, 1993). 


rearrangement appears to be increased sequence divergence in the tRNA-Glu gene 
(17.6% different between the common chaffinch and the greenfinch, about 35% 
between either of these and the domestic chicken, and about 27% between the 
domestic chicken and the lesser snow goose) (Marshall and Baker, 1997; Quinn and 
Wilson, 1993). As pointed out by Quinn and Wilson (1993), this may be due to 
reduced functional constraint because this tRNA is no longer associated with the 
sense transcript of any gene. We have found that there is sufficient variation in the 
six bases at the 3’ end of this tRNA to prevent the primer GSLGLU (P. Boag, 
personal communication) from annealing at its designed priming site, exactly adja- 
cent to the D loop. In the common chaffinch and the greenfinch, GSLGLU appears 
to anneal to a stretch of 10 bases about 500 bp downstream of the tRNA gene, 
showing 70% similarity to the 3’ 10 bases of the primer (see Fig. 3.1). 


2. Size and Base Composition 


The control region in birds is generally larger than the corresponding sequence in 
most mammals (e.g., 910 bp in the cow; Anderson et al., 1982), but is smaller than 
that in Xenopus laevis (2134 bp; Roe et al., 1985); control region length variation is 
thought to account for the size difference between bird and other vertebrate mt- 
DNAs (Desjardins and Morais, 1990). The avian control region ranges in size from 
1072 bp in the dunlin to about 1240 bp in greenfinch, with the average size being 
1168 bp. In pairwise comparisons of related species, this variation can often be 
attributed to relatively small (1—20 bp) insertions and deletions (indels) in the 5’ 
and 3' flanking regions, and varying numbers of tandem repeats in the 3’ domain 

TABLE II  Base Composition of Avian Control Regions and Their Respective Domains

                            Nucleotide frequency
Segment                     A        C        G        T
Whole control region
  Peking duck               28.4     31.3     16.2     24.1
  Muscovy duck              28.4     31.5     16.0     24.1
  Lesser snow goose         28.5     30.8     14.8     26.0
  Black guillemot           30.5     28.5     13.7     27.2
  Chicken                   26.7     26.3     13.3     33.7
  Japanese quail            26.0     25.8     14.2     33.9
  Turnstone                 29.5     29.2     14.1     27.3
  Dunlin                    30.6     25.9     14.9     28.5
  Common chaffinch          29.2     27.1     14.1     29.5
  Greenfinch                29.1     28.6     13.0     29.3
  Average:                  28.69    28.50    14.43    28.36
  Standard deviation:       1.393    2.085    1.004    3.252
Domain I
  Peking duck               29.8     33.0     14.6     22.6
  Muscovy duck              32.2     33.6     14.5     19.8
  Lesser snow goose         31.5     32.5     12.1     23.9
  Chicken                   26.8     29.9     13.1     30.1
  Japanese quail            25.7     29.3     14.7     30.3
  Turnstone                 27.1     32.4     15.8     24.7
  Dunlin                    29.6     29.3     16.5     24.6
  Common chaffinch          27.2     31.4     14.8     26.6
  Greenfinch                28.2     30.9     16.1     24.8
  Average:                  28.68    31.37    14.69    25.27
  Standard deviation:       2.102    1.525    1.325    3.168

(Continues) 


(see below). However, between the turnstone and the dunlin there appears to be 
at least one large (about 65 bp) indel at the 5' end of the control region. In addition, 
Quinn and Wilson (1993) described relatively large deletions in both the 5' (61 bp) 
and 3' (38 bp) flanking regions of the lesser snow goose as compared to the domestic 
chicken, and Desjardins and Morais (1991) discussed a 57-bp deletion in the Japa- 
nese quail relative to the domestic chicken in the 5' portion of the control region. 
Ramirez et al. (1993) also reported large deletions in both flanking regions in the 
Peking duck versus the domestic chicken. 
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ТАВІЕ П (Continued) 


Nucleotide frequency 














Segment A C G T 
Domain II 
Peking duck 14.9 33.9 20.6 30.6 
Muscovy duck 14.2 33.6 20.2 32.0 
Lesser snow goose 17.4 32.1 19.2 31.4 
Chicken 15.4 28.1 21.6 35.0 
Japanese quail 14.6 30.2 20.5 34.7 
Turnstone 21.0 25.9 18.8 35.1 
Dunlin 22.1 24.4 18.0 34.7 
Common chaffinch 23.6 26.8 19.8 29.7 
Greenfinch 22.6 27.8 19.2 30.6 
Average: 18.42 29.20 19.77 32.64 
Standard deviation: 3.643 3.227 1.026 2.082 
Domain III 

Peking duck 34.5 28.2 15:3 224 
Muscovy duck 32.7 28.3 15.2 23.8 
Lesser snow goose 32.4 28.7 14.3 24.6 
Chicken 35.9 19.6 6.7 37.8 
Japanese quail 36.1 17.6 8.3 38.0 
Turnstone 48.7 30.0 4.1 17.2 
Dunlin 49.0 24.3 4.9 21.8 
Common chaffinch 37.7 26.1 3.3 32.9 

Greenfinch 316 ша. 60 353 — 

Average: 38.29 24.88 8.678 28.17 
Standard deviation: 5.913 4.223 4.635 7.395 


The base composition of control region sequences is reported in Table II. As is 
typical for vertebrate mtDNA, the GC asymmetry between the two strands of 
DNA, resulting in one being relatively "light" (low G) and the other "heavy," also 
occurs in the control region. Overall base composition shows a paucity of G; about 
14% of the light strand of the control region is composed of G, compared to about 
28% for each of the other bases. Base composition does not vary greatly among 
species, but it varies significantly among regions of the D loop, and also among 
species when regions are considered individually (see next section). 
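
Base composition of the kind reported in Table II is straightforward to tabulate from an L-strand sequence. The sketch below is illustrative only: the function name is hypothetical, ambiguous bases and alignment gaps are simply ignored (an assumption on our part), and the short sequence is invented rather than taken from any of the species listed.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G, and T among the unambiguous bases of a sequence."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Invented fragment; gaps (-) and ambiguity codes (N) are skipped.
fragment = "AACCTTAACGTT--NACCATTC"
for base, pct in base_composition(fragment).items():
    print(base, round(pct, 1))
```

Applying the same function separately to each domain of an aligned control region would give per-domain frequencies comparable to those in the lower sections of Table II.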


3. Species-Pair Comparisons and Domains 


For descriptive purposes, the control region is often divided into three subregions: 
a central, more conserved domain, low in L-strand A, which is responsible for the 
three-strand displacement (D)-loop structure (Clayton, 1991), flanked by two vari- 
able A-rich domains showing extensive size and sequence variation (Saccone et al., 
1991). In mammals, the tRNA-Phe-adjacent domain is of varying length and greatest 
divergence, and contains three conserved sequence blocks [CSB-1 (not found in 
rats), -2, and -3], OH, and the transcription promoters for both the heavy and light 
strands (HSP and LSP). The approximately 200-bp central conserved block is char- 
acterized by low L-strand A and high G content, and harbors open reading frames 
(ORFs) of varying lengths. The tRNA-Pro-adjacent domain shows the highest A and 
the lowest G content of the whole control region, and includes short termination- 
associated sequences (TASs) typified by the nucleotide motif T TACAT (Saccone 
et al., 1987). Also found in the flanking domains and associated with the CSBs and 
TASs are relatively stable cloverleaf-like structures of low primary sequence simi- 
larity (Dunon-Bluteau and Brun, 1987). 

Three such domains have also been demonstrated in most avian control region 
sequences studied to date. We define them here as the following: I, the region 
adjacent to the tRNA-Glu gene; II, the central conserved domain; and III, the region 
closest to the tRNA-Phe gene. To examine these regions we took a series of species-pair 
alignments and plotted the number of variable sites between them in nonoverlap- 
ping 50-bp windows. This enabled rough designation of the boundaries of the re- 
gions; further refinement was obtained by examination of the alignments. Species- 
pairs examined were: greenfinch and common chaffinch; dunlin and turnstone; 
turnstone and black guillemot; Muscovy duck and Peking duck; Peking duck and 
lesser snow goose; lesser snow goose and domestic chicken; and domestic chicken 
and Japanese quail. Domain structure was clearly found in all comparisons except 
the black guillemot versus the turnstone and the lesser snow goose versus the do- 
mestic chicken (Fig. 3.2). Domains are defined not only by their degree of vari- 
ability but also by base composition (Table II). Specifically, domain I is AC rich, 
domain II is CT rich, and domain III is AT rich and very low in G. The increased 
conservation of the central domain II is shown in Table III; it has fewer indels and 
considerably lower among-species sequence divergence than the flanking domains. 
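
The window-based scan described above can be illustrated with a short sketch. It is not the procedure used to draw Fig. 3.2, only a minimal version assuming two already aligned sequences of equal length, nonoverlapping 50-column windows, and a hypothetical option (`count_gaps`) for whether a column with a gap in one sequence counts as variable, a choice the text does not specify.

```python
def variable_sites_per_window(seq_a, seq_b, window=50, count_gaps=True):
    """Count variable alignment columns in consecutive nonoverlapping windows.

    seq_a, seq_b: two rows of a pairwise alignment (equal length, '-' for gaps).
    count_gaps:   assumption of this sketch; if True, a column with a gap in
                  exactly one sequence is counted as a variable site.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    counts = []
    for start in range(0, len(seq_a), window):
        n_variable = 0
        for a, b in zip(seq_a[start:start + window], seq_b[start:start + window]):
            a, b = a.upper(), b.upper()
            if a == b:
                continue  # identical bases (or a shared gap) are not variable
            if not count_gaps and "-" in (a, b):
                continue  # optionally ignore indel columns
            n_variable += 1
        counts.append(n_variable)
    return counts


# Toy alignment, 10 columns, scanned in 5-column windows.
toy_a = "ACGTACGTAC"
toy_b = "ACGAACG-AC"
window_counts = variable_sites_per_window(toy_a, toy_b, window=5)
```

Plotting the returned counts against window position gives a profile of the kind used to place rough domain boundaries; the final boundaries would still need inspection of the alignment itself.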

Brown et al. (1986) pointed out that the central domain is well preserved evolu- 
tionarily, occurring even in X. laevis, which has a different mtDNA base composi- 
tion from mammals, and that this region must therefore be functionally constrained. 


4. Conserved Structural Features 


Multiple alignment of five species of birds from three orders (Galliformes, Anseri- 
formes, and Charadriiformes) illustrates the great primary structure variability 
among control regions at this taxonomic level (Fig. 3.3). Nevertheless, conserved 
FIGURE 3.2 Plots of number of variable sites in 50 base windows along the control region for seven 
species-pairs. Species-pair alignments were done using Clustal V (Higgins et al., 1991). 


blocks [F, D, and C boxes (Southern et al., 1988), and CSB-1] are clearly evident 
in the alignment. All three conserved boxes are located in domain II, but CSB-1 is 
located in domain III. These conserved structural features also occur across a broad 
range of birds including passerines, penguins, and kiwis. H-strand replication is 
thought to be initiated (and therefore O_H must occur) in the vicinity of CSB-1 in 
domestic chicken, human, and amphibian mtDNAs (Desjardins and Morais, 1991), 
TABLE III  Among-Species Variability in the Control Region of Birds

                                 Size       Percent
                                 compared   sequence                                 Indel/
Species pair                     (bp)       divergence   Ts/site   Tv/site   Ts/Tv   site

Whole control region
  Muscovy duck/Peking duck       1162       12.23        0.0740    0.0422    1.755   0.0430
  Dunlin/turnstone               1202       14.09        0.0757    0.0458    1.655   0.1165
  Chicken/Japanese quail         1233       14.21        0.0681    0.0641    1.063   0.0700
  Common chaffinch/greenfinch    1251       16.25        0.0727    0.0815    0.892   0.0248
  Peking duck/lesser snow goose  1196       24.73        0.1104    0.1196    0.923   0.0702
Domain I
  Muscovy duck/Peking duck        448       18.33        0.1027    0.0692    1.484   0.0625
  Dunlin/turnstone                445       18.61        0.0810    0.0517    1.565   0.2292
  Chicken/Japanese quail          550       14.58        0.0745    0.0527    1.414   0.1273
  Common chaffinch/greenfinch     531       16.86        0.0791    0.0847    0.933   0.0282
  Peking duck/lesser snow goose   396       29.92        0.1212    0.1515    0.800   0.0884
Domain II
  Muscovy duck/Peking duck        253        0.81        0.0040    0.0040    1.000   0.0198
  Dunlin/turnstone                486        6.04        0.0412    0.0185    2.222   0.0123
  Chicken/Japanese quail          309        8.52        0.0647    0.0324    1.600   0.0129
  Common chaffinch/greenfinch     368        5.52        0.0272    0.0217    1.250   0.0000
  Peking duck/lesser snow goose   288        4.24        0.0278    0.0139    2.000   0.0174
Domain III
  Muscovy duck/Peking duck        461       12.84        0.0846    0.0369    2.294   0.0369
  Dunlin/turnstone                271       24.27        0.1292    0.0849    1.522   0.1181
  Chicken/Japanese quail          374       18.51        0.0722    0.1070    0.675   0.0321
  Common chaffinch/greenfinch     352       26.19        0.1108    0.1392    0.796   0.0455
  Peking duck/lesser snow goose   512       33.12        0.1484    0.1543    0.962   0.0859

Abbreviations: Ts, transition; Tv, transversion; Indel, insertion or deletion.


and to end in domain I. One to three copies of a direct repeat of CSB-1 have been 
reported in the domestic chicken, Japanese quail, Peking duck, and lesser snow 
goose control regions. One such repeat occurs in the black guillemot and possibly 
the dunlin. Also found in domain I of all birds examined are one or more putative 
TAS (5' TATAT 3' or 5’ TACAT 3’) elements located upstream from the most 5’ 
CSB-1-like repeat (Fig. 3.3). These TAS elements and the CSB-1-like repeats are 
thought to be involved in termination of D-loop synthesis (Doda et al., 1981). In 
addition, sequences downstream and upstream of the CSB-1 and CSB-1-like re- 
peats, respectively, are capable of forming conserved, thermodynamically stable 
tRNA-like cloverleaf structures thought to be associated with the start and arrest of 
DNA synthesis (Brown et al., 1986). 
5. Bidirectional Transcription Promoter 


In the domestic chicken one bidirectional transcription promoter has been identified
in the tRNA-Phe-adjacent domain, downstream from CSB-1 (Fig. 3.3). This is 
an AT-rich sequence containing an inverted repeat capable of forming a cruciform 
structure, and is flanked on either end by an octanucleotide sequence similar to the 
H-strand transcription start sites in mouse and X. laevis (L'Abbé et al., 1991). A 
putative bidirectional transcription promoter can readily be identified in the Peking 
duck, the Muscovy duck, the Japanese quail, and the lesser snow goose, but is less 
apparent in the other sequences examined here. 


6. Sequence Simplicity 


The AT-rich domain III of several species is characterized by a number of repeats
of a microsatellite-like motif. In particular, the sequence 5' CAACAAA 3' is
directly repeated at least six times in four charadriiform species [lesser black-backed
gull, Larus fuscus; herring gull, Larus argentatus; razorbill, Alca torda (Berg et al.,
1995); and the turnstone (Wenink et al., 1994)] and 20-37 times in the cuckoo
(Cuculus canorus; Gibbs et al., 1996). Variants of this CA motif occur in the turnstone
and dunlin (Wenink et al., 1994), the black guillemot (V. L. Friesen, personal
communication; Berg et al., 1995), and an additional 11 charadriiform species (Berg
et al., 1995). There is also a simple sequence repeat in the third domain of the brown
kiwi (A. J. Baker, unpublished data). Repeat number varies both inter- and
intraspecifically, in addition to showing heteroplasmy within individuals (Berg et al.,
1995). These repeats occur adjacent or close to the tRNA-Phe gene, and account for some
of the length variation among species in this domain. More complicated repeats also
occur. For example, the motif 5' TCATCACACATTTATCATCA 3' is repeated
twice in the razorbill (Berg et al., 1995).
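
Copy number of a direct repeat such as 5' CAACAAA 3' can be tallied from a sequence with a few lines of code. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, and it counts perfect, uninterrupted tandem copies, so the motif variants mentioned above would need a more tolerant search.

```python
import re


def longest_tandem_run(sequence, motif):
    """Return the largest number of perfect, back-to-back copies of `motif`.

    Only exact, uninterrupted repeats are counted; imperfect copies break a run.
    """
    pattern = re.compile("(?:%s)+" % re.escape(motif.upper()))
    best = 0
    for match in pattern.finditer(sequence.upper()):
        copies = len(match.group(0)) // len(motif)
        best = max(best, copies)
    return best


# Hypothetical fragment: six tandem copies, a spacer, then two more copies.
toy_fragment = "gg" + "caacaaa" * 6 + "tt" + "caacaaa" * 2
max_copies = longest_tandem_run(toy_fragment, "CAACAAA")
```

Applied to several clones or individuals, differences in the returned copy number would correspond to the interspecific, intraspecific, and heteroplasmic length variation described in the text.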


B. Sequence Evolution 


In addition to transition and transversion substitutions and numerous small indels, 
length differences accumulate through variation in number of tandem repeats, and 
relatively large duplication or deletion events. Both inter- and intraspecific variation 
is more common in the two flanking domains than in the conserved central block, 
with tandem repeats occurring primarily in domain III and larger duplications
restricted to among-species comparisons. 


1. Among-Species Variation 


The main features of among-species variation in the control region are summarized 
in Table III. In pairwise comparisons, sequence divergence ranges from about 12 to 
25% overall; by domain the average divergences are approximately 20% (I), 5% (II), 
Chicken 4  -------- aat--tttatt--ttttaacctaactcccctactaagtgtacccccecttteccc-------- agggggggtat---actatgcataatcgt 77 
Peking Duck acagctagaatagcctaataatgct--ctca---ggacc------ Cccccccecccttccccececcecaggggttgcgggggttatttggttatgcatat-cgt 88 

Snow Goose taaccgcaaacgccccaatgattctaccccatctatgccgttatgcttaacccccceccecccctccccccccegggcggggtatttggttatgcatattcgt 100 
Guillemot ctagccatcatggctaaat----ctcaaccacaggaaccaa-agacgcccccaaaatatctgagggacaccac-caccccaccc-cccacccatgtacat 93 

Dunlin tgtccaattatg--tgg------- tgegctgcatatact------ ае ааа ec-c-catactacat-accatccatgttc-c 64 

* * * ae * х xx * 
Chicken gcatacatttatataccacatatattatggtaccggtaatatatactatatatgtactaaacccattata-tgtatacgggcattaacctatattccaca 176 
Peking Duck gcatacatttatattccccatatattaac---cta-tggtcccggtaataaacactattaaccaactatcctacatgcacggactaaacc-cat--caca 181 
Snow Goose gcatagatttatatgccccatatacatacatacta-tagtaccggtaatatacattatatacgagctatcctataagcaggtgctaaacc-cat---aca 195 
Guillemot tagtacattaacttaccccata-ac--acata- tagtgcatgt--cctataccactaatatacaaggcggcataccc-cacctcaccacatcccac- 182 
Dunlin aaatccattaatttac Ccgggc--tatacacctct-------------------- C-cacccc-tcac---ccgc- 114 
*o 0 ** 4 ж * * 
Chicken tttctcccaat cattctatgcatgatc acat-actcat-ttaccctccccatagacagttccaaaccactatcaagccacctaactatgaatg 274 
Peking Duck tgtc----aacggacat--accctactatc--ggac-t-accctc-ccaacggacccagagtgaatgct---ctaatgcccaacacctcaacgcc--aca 265 
Snow Goose tgta----cacggccattaaacccttaaac--acac-t--cctac-caaaccactacaacatgaatgct---ctaggdaccataccccataatacccaatg 282 
Guillemot ---ctctagagggcagttgagtcaatggacactggaatgatacacattatccacactaaaaccattataatagt-ggactgtaca--t-aataccc---- 271 
Dunlin ---ctccagaggdtaaccgaagcaatgaacctaggaat-attcacacacactgtactaaacccatcaacttgttaggattataca--ttaaaactc---- 204 
. * * * * ж > жж 
Chicken gttacaggacataaatctcactctcatgttctccccccaacaagtcacctaactatgaatggttacaggacatacatttaactaccatgttctaacccat 374 
Peking Duck -taacat-gcccccaac-cagaa-caaggcgccataatgatgaat-gcttgdac-agacataccc-tacca-acactccaaattcctctccacccacccat 357 
Snow Goose -taaccccactcacgca-cacaa-caagaccccatattaatgaat-gcttaca-ggacataccc-taaca-a-- caa -ctctctacc-acatat 365 
Guillemot | ----- ctaacttacatggcagtgcttgaa-cccatatcc-tgaatgatctca--gdgacaaatccataccatgt-- - --ctctcgctgtaccta- 350 
Dunlin ----tctaaagcgtacggcagtgctttaa-cacacgdcca-tgattggtttaa--gtgcada-cagctcgaaaa----------- ctctcgaagtgcaca- 283 
** жж аж ж ж ж * * * * 

Chicken ttggttatgct-cgccgtatcagatggatttattgatcGTCCACCTCACGAGAGATCAGCAACCC-Ctgcctgtaatgtacttcatgaccagtctcaggc 472 
Peking Duck t-actcatgaagctgcgtaccagatggatttattaatcGTACACCTCACGTGAAATCAGCAATCC-Ttgcacataatgtccgacgtgactagcttcaggc 455 
Snow Goose ---ctcatgcagtt-cgtatcagatggatttattagtcGTACTCCTCACGTGAAATCAGCAACCCGTtgcacataatgtccggtatgactagcttcaggc 461 





Guillemot --cagctgcagactaggtc-atctattagtcGTACCTCTCACTTGAAATTAGCAACCCGAcgcatgt?agatccaacgttactagcttcagaa 439 
Dunlin --ccagt-cgtaccaggtt-atttattaatcGAGCTCCTCACGTGAAATCAGCAACCCGGCgtaagtaatgtcctgcgttactagcttcaggc 372 
Жж OX o XX XXX XXX XXX жж жж XX жж * *ox ok *ox* ** o REE 
F box 
Chicken ccattctttccccctacaccectegccctacttgecttccaccgtaCCTCTGGTTCCTCGGTCAGGCACATcccatgcataactcctgaactttcTCACT 572 
Peking Duck Ccatacgttccccctaaacccctcgcecctcctcacatttt--tgcgCCTCTGGTTCCTCGGTCAGGGCCATcaattgggtt-cactcacct---cTCGCC 549 
Snow Goose ccatacgttccccctaaacccctegccctcctcacatttt--tgcgCCTCTGGTTCCTCGGTCAGGGCCATccattgggtt-cactcacctctccTTGCC 558 
Guillemot tcattcattccccectaaacccctagcccaacttgctcttt--tgcaCCTCTGGTTCCTCGGTCAGGGCCATaacttgactagtcoctctcaa---cTTGTA 534 
Dunlin Cccattctttccccctaaaccc-tagcacaacttgctcttt--tgcgCCTCTGGTTCCTATGTCAGGGCCATaaataggttagtactcataa---cTTGCT 466 
EG ko OR X OR A dk OX OX Ж ЖҰЖ Xo x ox KORE do e x x x жж * 


FIGURE 3.3 Multiple alignment of five species generated by Clustal V (Higgins et al., 1991). Conserved sequence blocks (CSB-1 
and F, D, and C boxes) are capitalized, possible CSB-1 repeats are underlined, possible TATAT or TACAT termination-associated 
sequences are in boldface, and the putative bidirectional transcription promoter sequence is double-underlined. 


D box 
Chicken TTTCACGAAGTCATCTG-TGGATtatct-tcccctctttagtcegtgatcgcggcatcttctc-tcttctattgctgttggttccttctcttttt---gg 
Peking Duck CTTCAAAGTGGCATCTG-TGGAAtactt-ccaccatctcaatgcgtaatcgcggcatcttccagctttttggcgcctctggttccttttattttttcogg 
Snow Goose CTTCAAAGTGGCATCTG-TG-AGTactt-tcaccttctcaatgcgtaatcgcggcatgttccaqctttttggcgcctctggttcctcttattttttccgg 
Guillemot CTTCACCGATACATCTGGTCGGCtatttatcatcatct-cacccgtgatcgcgacat--ccgaccgtcttggcacttttggttcctttt--ttttttctt 
Dunlin CTTTACGAATACATCTGGTTGGCtatatctcaccattttcgtccgtgatcgcggcat--tocaaaattcttatacttttggttcctttt--ttttttggg 
and 23% (III). Similarly, numbers of transitions, transversions, and indels are
reduced in the central domain, although transition-to-transversion ratios appear
relatively constant among domains. Large deletions account for much of the length
variation between similar species. For example, in a comparison of the Peking duck
and the domestic chicken, a large deletion in domain I encompasses one of the
CSB-1-like repeats found in the domestic chicken, while the third domain is
characterized by the absence of more than 100 bases in the AT-rich region separating
the CSB-1 and the transcription promoter (Ramirez et al., 1993). In addition,
variation in the type and number of tandem microsatellite-like repeats occurs between
species. In the dunlin, for instance, the primary type of CA repeat is CAAA (18-23
copies), whereas the turnstone shows an array of repeats such as CAACAAA (12-14
copies) and CAACAAACAAA (4-8 copies) (Wenink et al., 1994). More recently,
Berg et al. (1995) reported variation in both type and number of repeats in a
study of 15 charadriiform species. Clearly, the most appropriate region for
among-species studies is the central conserved domain II, where alignment and homology
are the least ambiguous. 
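
Per-site counts of transitions, transversions, and indels of the kind tabulated for each species pair can be obtained from a pairwise alignment as sketched below. This is a minimal illustration, not the procedure behind Table III: the function name is hypothetical, every column containing a gap in one sequence is scored as one indel site, the denominator is the number of compared columns, and columns with ambiguous bases are skipped. The source does not state which conventions were actually used.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def pairwise_summary(seq_a, seq_b):
    """Summarize substitutions and gap columns between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    ts = tv = gap_columns = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # column is empty in both sequences
        if "-" in (a, b):
            compared += 1
            gap_columns += 1
            continue
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip ambiguous bases such as N or ?
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return {
        "sites": compared,
        "ts_per_site": ts / compared,
        "tv_per_site": tv / compared,
        "ts_tv": (ts / tv) if tv else None,
        "indel_per_site": gap_columns / compared,
    }


# Toy alignment: one transition (A/G), one transversion (T/A), one gap column.
summary = pairwise_summary("ACGTAC-T", "GCGAACTT")
```

Running the same function separately on the columns assigned to domains I, II, and III would give domain-by-domain values comparable in form to those in the table.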


2. Within-Species Patterns of Variability and Haplotypic Diversity 


Results from six avian control region population studies are presented in Table IV. 
Four of these focused on domain I, a fifth on the first two domains, and a sixth 
looked at domains II and III. As many as 27% of surveyed sites were found to be 
variable (in Adélie penguins) compared to as few as 1.2% (in knots), although 
sample sizes varied. In all studies transitions greatly outnumbered transversions and 
alignment gaps, so neither homology nor site saturation appears to be a major prob- 
lem at this level. Except for the knots, all species are characterized by high haplo- 
typic diversity regardless of whether the first or third domain was studied. Thus, in 
most cases the two flanking domains of the control region provide a rich source of 
genetic markers for population studies. 
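
Haplotypic diversity can be computed directly from the list of haplotypes observed in a sample. The sketch below assumes the standard sample-size-corrected estimator, h = n(1 - sum of x_i^2)/(n - 1), where x_i is the frequency of the ith haplotype and n is the sample size; the function name and the toy labels are hypothetical and are not data from the studies summarized here.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Return h = n * (1 - sum(x_i**2)) / (n - 1) for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n * (1.0 - sum_sq) / (n - 1)


# Toy samples: two haplotypes at equal frequency, and four distinct haplotypes.
h_two_common = haplotype_diversity(["H1", "H1", "H2", "H2"])
h_all_unique = haplotype_diversity(["H1", "H2", "H3", "H4"])
```

Values approach 1 when nearly every individual carries a different haplotype and fall toward 0 when one haplotype dominates the sample.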


III. POPULATION STRUCTURE AND 
INTRASPECIFIC TAXONOMY 


Thorough analysis of intraspecific sequence variation leads inevitably to considera- 
tion of the population genetic processes responsible for major phylogenetic subdi- 
visions in gene trees, and to consideration of taxonomic recognition of these dis- 
crete clades as subspecies, phylogenetic species, or biological species (Avise et al., 
1987; Avise, 1989; Avise and Ball, 1991). In this section we summarize studies of 
population structure within species of birds as recorded in their control region se- 
quences, and examine whether or not the gene trees reflect currently held views of 
intraspecific taxonomy based on other characters. 


TABLE IV  Within-Species Variability in the Control Region of Six Species of Birds

                                              No. of   No. of               Transition   Transversion
Species                n^a    Domain          bases    haplotypes   h^b     sites        sites          Gaps

Common chaffinch       166    II and III      598      65           0.955   60           22             ?
Gray-crowned babbler   163    I               400      86           0.973   88            6             ?
Lesser snow goose       81    I               178      26           0.827   21            1             ?
Knot                    25    I               255       7           0.449    3            0             ?
Dunlin                 155    I and II        608      39           0.887   38            ?             ?
Adélie penguin          82    I               300      63           0.982   65           13             ?

^a Number of individuals sequenced.

^b Haplotypic diversity, where $h = n(1 - \sum x_i^2)/(n - 1)$, $x_i$ is the frequency of the ith haplotype, and n is the sample size.
A. Global Phylogeography of the Dunlin 


One of the clearest examples of population structure revealed by the increased
resolution of control region sequences over that afforded by other mtDNA sequences is
in the dunlin (C. alpina). Dunlins have a circumpolar breeding range in the
Holarctic, and migrate along flyways to their wintering grounds in temperate and tropical
regions north of the Equator (Greenwood, 1984). Both sexes display high natal
philopatry to their breeding sites, thus suggesting that dunlins might be genetically
structured across their breeding range. 

An initial study of 73 dunlins sampled over most of this range (but lacking
material from eastern Siberia) screened for sequence variation in both a short
cytochrome b fragment (302 bp) and two control region fragments (295 and 313 bp,
respectively) located in domains I and II (Wenink et al., 1993). Most variability
occurred in segment I in the first domain (30 of 42 variable sites in both fragments).
Of the 50 variable sites detected in the total 910 bp of mtDNA sequence, only 8
were located in the cytochrome b fragment, whereas more than 5 times
as many (42) were found in the 2 control region segments. Haplotypic
diversity was also correspondingly higher for the control region; only 10 haplotypes
were defined by the cytochrome b segment compared to 33 for the control region
segments. Thus of the total of 35 haplotypes detected by the cytochrome b and
control region sequences, only 2 unique haplotypes were added by assaying the
more slowly evolving protein-coding gene. The rate of substitution (Jukes-Cantor
corrected d = 0.034 ± 0.017 substitutions/site) in domain I of the control region
of the dunlin is similar to the rate at synonymous sites (d_s = 0.027 ± 0.012) in
cytochrome b, suggesting that the faster evolution of this control region segment
emanates from lower selective constraints rather than an elevated mutation rate.
Conversely, the rate of substitution (d = 0.006 ± 0.005) in the much more
constrained domain II of the control region approximates that at first positions in
codons (d_1 = 0.007 ± 0.005) in cytochrome b. 
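
For readers unfamiliar with the Jukes-Cantor correction mentioned above, the standard formula converts an observed proportion of differing sites, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The sketch below is a generic illustration with a hypothetical function name; it does not reproduce the variance estimates or the exact calculations of the studies cited.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# An observed difference of 10% corresponds to slightly more than 0.10
# substitutions per site once multiple hits are allowed for.
d_example = jukes_cantor_distance(0.10)
```

At the low divergences typical of within-species comparisons the correction is small, so corrected and uncorrected values are nearly equal.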

Phylogenetic analysis of the sequences partitioned them into five major phylo- 
geographic groups, but clearly a larger analysis was required to confirm this pattern 
on a global scale. When control region sequences of 155 dunlins from 15 breed- 
ing populations (including samples from eastern Siberia and a much larger repre- 
sentation from Europe, Iceland, and Greenland) were subsequently analyzed, the 
existence of the 5 major monophyletic clades was confirmed, and no additional 
phylogeographic groups were found (Fig. 3.4). Only six more control region hap- 
lotypes were detected in this larger sampling of dunlins. A hierarchical analysis of 
molecular variance (Excoffier et al., 1992) grouping the total sample into five geo- 
graphic regions partitioned 76.3% of the total molecular variance among regions, 
2.3% among populations within regions, and 21.4% within populations (Wenink 
et al., 1996). Despite their high potential for dispersal, dunlins have a much more 
subdivided population structure than do humans, for example, where 22% of the 
FIGURE 3.4  Neighbor-joining tree showing genealogical relationships among 39 control region
haplotypes found in 155 dunlins. Phylogeographic groups and subspecies designations are shown to the 
right, and sample sizes are given in parentheses. Asterisks indicate haplotypes that were found in putative 
immigrants in other groups. The tree is rooted with a sequence from the purple sandpiper (Calidris 
maritima). (Redrawn from Wenink et al., 1996. Reprinted with permission of the publisher.) 


molecular variance in their control region sequences is distributed among major 
geographic groups (Excoffier et al., 1992). 

Times of divergence of the five major phylogeographic groups of dunlins were 
estimated on the basis of the amount of sequence divergence among them corrected 
for within-group variation. Estimates of divergence times of the major dunlin line- 
ages all fall in the late Pleistocene, suggesting that they arose by fragmentation of 
populations in isolated tundra refugia (Wenink et al., 1996). 

The imprint of this historical population subdivision has most likely been main- 
tained to the present by strong philopatry on the breeding grounds. Mixing of hap- 
lotypes indicative of gene flow was observed only in dunlins breeding in Europe 
and Siberia. Eight of 117 dunlins sampled across arctic Eurasia had haplotypes that 
did not belong to the phylogeographic group in which they were found breeding 
(see Fig. 3.4), implying that they were immigrants. Cladistic measures of gene flow 
among Eurasian populations were derived by inferring the minimum number of 
historical migration events from the tree in Fig. 3.4, and translating them into values 
of Nm, where N is effective population size of females and m is the migration rate 
(Slatkin and Maddison, 1989). When values of Nm are less than about four individ- 
uals per generation, gene flow is insufficient to prevent population structure evolv- 
FIGURE 3.5  Parsimony network of control region haplotypes found in 85 dunlins from the European 
phylogeographic group. Substitutions are indicated by slashes on the network, and homoplasies are in- 
dicated with numbers as in Wenink et al. (1996). The frequency of each haplotype is shown in parenthe- 
ses. Haplotypes found in Iceland are shaded gray. 


ing by genetic drift alone (Birky et al., 1983). Values of Nm among Eurasian locales 
are low for all comparisons except between northern and southern Norway, imply- 
ing that gene flow is strong enough to prevent local differentiation only in Norway. 
Thus we might expect to see some evidence of evolving population structure in 
the European phylogeographic group that has been extensively sampled. Geo- 
graphic localization of haplotypes supports this conjecture; for example, haplotypes 
EUR16—EUR20 were found only in Norway and Sweden, and haplotypes EUR12 
and EUR14—15 occurred at high frequency or exclusively in Iceland (Wenink et al., 
1996). Corrected sequence divergence of 0.13% between the Iceland population 
and the rest of the European phylogeographic group suggests they diverged about 
9000 years ago, and that Iceland was colonized following retreat of ice sheets about 
12,000 years ago (Denton and Hughes, 1981). 
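
The arithmetic behind such a date can be sketched as follows. Two assumptions are made here that the text does not state explicitly: that "corrected for within-group variation" means the usual net divergence (between-group divergence minus the mean of the two within-group values), and that time is obtained by dividing net divergence by a pairwise divergence rate. The rate used in the example is simply back-calculated from the two figures quoted above (0.13% and about 9000 years) and is not an independently reported calibration.

```python
def net_divergence(d_between, d_within_a, d_within_b):
    """Between-group divergence corrected for average within-group variation."""
    return d_between - (d_within_a + d_within_b) / 2.0


def divergence_time(d_net, rate_per_year):
    """Years since separation, given a pairwise divergence rate per site per year."""
    return d_net / rate_per_year


# Rate implied by the figures quoted in the text (an inference, not a source value).
implied_rate = 0.0013 / 9000.0
years = divergence_time(0.0013, implied_rate)

# Hypothetical inputs showing the within-group correction itself.
d_net_example = net_divergence(0.010, 0.004, 0.002)
```

Because the estimate scales inversely with the assumed rate, any uncertainty in the rate carries over proportionally to the inferred date.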

The high sequence similarity among European dunlins suggests that they also 
have only recently been fragmented following post-Pleistocene reinvasion of new 
breeding grounds exposed by the melting ice sheets. The widespread distribution 
and abundance of haplotype EUR1 in all sampled European locales is consistent 
with this hypothesis. Coalescent theory predicts that the oldest haplotype in a line- 
age should be most widely distributed among subpopulations (Takahata, 1988), and 
it should also have the highest number of mutational connections to other haplo- 
types in a parsimony network (Crandall and Templeton, 1993) (see Fig. 3.5). EUR1 
is not only central in the network but it is also basal in the neighbor-joining tree 
(Fig. 3.4), indicating that it is ancestral to the other European haplotypes. 

The five major phylogeographic groups in the control region gene tree corre- 
spond with five subspecies defined on morphological criteria or geographic sepa- 
ration (see Fig. 3.4). Interestingly, two currently disputed subspecies (Calidris alpina 
hudsonia in central Canada, and C. a. centralis in central Siberia) are supported in the 
tree, whereas three others (C. a. arctica in Greenland, C. a. arcticola in northern 
Alaska, and C. a. schinzii in the Baltic region of Europe) are not represented by 
phylogenetic discontinuities. Caution is warranted in basing taxonomic conclusions 
solely on one gene tree, as strong selection on phenotypic characters might easily 
precede monophyly of haplotypes in newly isolated populations. Conversely, the 
previous lumping of the populations breeding in central Canada and in southern 
Alaska under C. a. pacifica on the basis of their long bills (Greenwood, 1986) is 
clearly invalidated by the major genetic differences between them. 


B. Bottlenecking and Recent Population 
Expansion in Knots 


A study of the control region sequences of 25 knots sampled from 10 populations 
representing 4 of the 5 subspecies recognized around the world (Tomkovich, 1992; 
Piersma and Davidson, 1992) came to the surprising conclusion that the species is 
effectively globally panmictic (Baker et al., 1994). Furthermore, knots are depau- 
perate genetically, as only seven haplotypes were found worldwide, all closely re- 
lated and differing by one to three substitutions. Only seven variable sites were 
found in domain I, and none elsewhere in the control region. All substitutions were 
transitions. These observations most likely indicate that knots were bottlenecked 
down to a small population size in the late Pleistocene, and have expanded to the 
present broad distribution only in the last 10,000 years or so (Baker et al., 1994). 
The alternative is that a selectively advantageous kind of mtDNA arose relatively 
recently and has swept through the populations, replacing older haplotypes as it did 
so. The control region is noncoding and sequences seem to be selectively neutral 
(Edwards, 1993b), but the mitochondrial genome is a single linkage group, and 
selection on an advantageous mutation in any gene could in theory sweep the entire 
haplotype toward fixation. However, the hypothesis that the low variability ema- 
nates from a severe population contraction rather than a selective sweep is also sup- 
ported by an assay of 37 allozyme loci. These nuclear genes had an average hetero- 
zygosity (H = 0.035) at the low end for birds (Baker and Strauch, 1988). 

Under these demographic conditions, even control region sequences will not be 
useful in supplying genetic markers for detecting population structure and phylo- 
genetic breaks in gene trees indicative of subspecies. Even if bottlenecking had been 
less severe and thus many more variable positions had been found, the average time 
to coalescence of haplotypes in avian mtDNA gene trees would almost certainly be 
much longer than the time since the population expanded (Moore, 1995). Incom- 
plete lineage sorting would make it difficult or impossible to track the population 
or subspecies splits with the control region gene tree. 

Another difficulty illustrated by the knot sequences is that the populations are 
unlikely to be in equilibrium with respect to mutation and genetic drift. Not only 
will this violate assumptions of methods that estimate population subdivision and 
gene flow, but ancestral polymorphisms distributed in the wave of population ex- 
pansion will indicate that the species is globally panmictic when it actually may not 
be. Under such circumstances it may be possible to track population histories with 
even faster evolving microsatellite loci, but the number of loci required to do this 
could be depressingly large. For example, to track even more widely spaced specia- 
tion events in birds, Moore (1995) has estimated it may take 16 nuclear loci evolving 
at an appropriate rate. 


C. Colonization Routes of Chaffinches in the 
Atlantic Islands 


Chaffinches are ideal for investigating the linkage between microevolutionary pro- 
cesses and speciation because they provide a variety of windows through time to 
trace the diversification of lineages at various levels in the taxonomic hierarchy. At 
least two colonizations of ancestral chaffinch stock to the isolated island archipela- 
goes in the Atlantic from continental Europe and/or Africa seem to have occurred 
in the last few million years or so (Grant, 1979), an earlier one that culminated in 
the distinctive blue chaffinch (Fringilla teydea), and a later one that populated the 
Azores, Madeira, and the Canary Islands with well-differentiated subspecies of the 
common chaffinch (Fringilla coelebs). All island birds share a common phenotypic 
theme of larger body size and legs, shorter wings, and a dull orange-pink breast and 
blue dorsal plumage, suggesting they were derived from an expansive wave of colo- 
nization from the continents. In addition, common chaffinches appear to have ex- 
panded northward out of Africa into southern Europe within the last 100,000 years, 
and over the last 15,000-3000 years have colonized Europe behind the retreating 
ice sheets. 

Studies of morphometrics (Grant, 1979; Dennison and Baker, 1991), allozymes 
(Baker et al., 1990), and songs (Lynch and Baker, 1986, 1993, 1994) have confirmed 
the greater amount of population differentiation among islands in the Canaries than 
the Azores, consistent with reduced population sizes, lower gene flow, and en- 
hanced genetic drift (and possibly selection) in the former archipelago. However, 
none of these character sets were able to resolve the origins of the island forms, nor 
were they able to distinguish between hypotheses of multiple invasions from several 
continental sources or a single invasion from Africa or Europe. Resolution of the 
colonization route(s) is important in testing hypotheses of convergent evolution in 
TABLE V  Regional Samples of Common Chaffinches Analyzed
for Control Region Variation, Haplotypic Diversity (h), and
Average Sequence Divergence (d)

                            No. of
Region               n      haplotypes    h       d

Continent
  Southern Europe    28     16            0.86    0.46
  Iberia             19     11            0.79    0.54
  Northern Europe    20     11            0.82    0.46
  Africa             29     13            0.83    2.3
Atlantic islands
  Azores             25      9            0.74    0.33
  Madeira            10      3            0.66    1.3
  Canaries           35     11            0.83    1.4


plumage coloration of island birds (assuming the Azores were colonized by Euro- 
pean birds with olive-brown backs and brick-red breasts today represented by Frin- 
gilla coelebs coelebs, and the Canaries by African birds with blue backs and pale pink 
breasts now referred to as Fringilla coelebs africana). The hypothesis that the bigger 
blue chaffinch and the smaller common chaffinch underwent character displace- 
ment after the latter invaded the Canaries a second time also hinges on colonization 
routes. For example, if the source of the later invasion of the common chaffinch 
was from the nearest archipelago of Madeira (rather than from Africa) then char- 
acter displacement is unlikely because the Madeiran birds are even smaller than their 
Canaries conspecifics. 

We sought to reconstruct colonization routes using genetic markers obtained by 
sequencing 598 bp in domains II and III of the control region of 166 individuals 
from 19 populations of common chaffinches and an outgroup sample of blue chaf- 
finches (Table V) (H. D. Marshall and A. J. Baker, unpublished data). The sequences 
are highly polymorphic; 81 variable sites were found that defined 65 haplotypes. 
Forty of the haplotypes were unique to individual birds, and the remaining 25
occurred in 2-21 birds. The ratio of transitions to transversions was much lower for
the control region (2.73:1; 60 transitions, 22 transversions) than is typically found
in intraspecific comparisons of coding sequences such as cytochrome b for
passerines (20:1; Edwards et al., 1991), indicating the faster rate of evolution in the control
region. 

A condensed neighbor-joining tree depicting relationships among the control 
region haplotypes of regional groups of chaffinches is shown in Fig. 3.6. Salient 
features of the tree for reconstructing the routes of colonization are as follows: 
(1) the continental and Atlantic island haplotypes are monophyletic and form an 
FIGURE 3.6  Neighbor-joining tree showing genealogical relationships among common chaffinch 
control region haplotypes from Atlantic islands, north Africa, and Europe. The tree is rooted with a 
sequence from the blue chaffinch. 


unresolved trichotomy, consistent with only one wave of colonization; (2) the 
grouping of the geographically intermediate Madeiran haplotypes with both the 
Azores and Canaries clades suggests secondary gene flow or admixture of the island 
populations; (3) coalescence of the extant continental haplotypes is much more re- 
cent than colonization of the islands, judging from the much shorter branch lengths 
in the Africa-Europe clade, consistent with their split about 100,000 years ago; and
(4) two of four haplotypes from Nefza in Tunisia are basal in the tree, suggesting 
an African origin for all the island and European colonists. The most parsimonious 
hypothesis consistent with the tree is that the Atlantic island populations are de- 
rived by a sequential wave of colonization from Africa to the Canaries, then to 
Madeira, and finally to the Azores. Character displacement of blue and common 
chaffinches following their secondary contact in the Canaries is thus plausible, but 
postulation of convergent phenotypic evolution of birds from different archipela- 
goes is unnecessary. 

The control region sequences are also informative with respect to the pheno- 
typically based intraspecific taxonomy of common chaffinches in the Atlantic is- 
lands. In the Canaries, birds from the older eastern islands of Gran Canaria, Tener- 
ife, and Gomera are referred to as Fringilla coelebs canariensis. Additional taxonomic 
uncertainty is associated with the two populations on the younger western islands 
of La Palma and Hierro. Are they both referable to F. c. palmae, or should the
Hierro population be recognized as a separate subspecies, F. c. ombriosa? Only two
kinds of sequences were found in these populations, and they cleanly distinguished 
the eastern and western island populations. This indicates that only two subspecies 
(F. c. canariensis and F. c. palmae) are warranted on the available mtDNA evidence. 
In addition, the recognition of a separate subspecies in the Azores (F. c. moreletti) is
well justified. Madeiran haplotypes are paraphyletic in the gene tree because of
historical gene flow between the archipelagoes, but this is a classic example of when 
gene trees will be positively misleading about population splits and intraspecific 
taxonomy. The Madeiran population warrants subspecific recognition as F. c. mad- 
erensis on the basis of its distinctive plumage and smaller size, attributes that have 
presumably been maintained by selection strong enough to overcome occasional 
gene flow between the archipelagoes. 


D. Intraspecific Variation in Canada Geese 


Among birds, the Canada goose (Branta canadensis) shows the most extreme ex- 
ample of intraspecific geographic variation in body size, which along with variation 
in plumage characters has been the basis for splitting the species into 11 subspecies. 
Previous studies of relationships between subspecies using RFLP analysis of the 
mtDNA genome (Shields and Wilson, 1987; Van Wagner and Baker, 1990) and 
sequences of a 612-bp fragment of the cytochrome b gene (Quinn et al., 1991) 
resolved the haplotypes into two distinct clades corresponding to large-bodied and 
small-bodied subspecies. However, the cytochrome b sequences were identical for 
each of the subspecies (Branta canadensis maxima, B. c. moffitti, B. c. occidentalis, and 
B. c. fulva) assayed by Quinn et al. (1991) in the big-bodied clade, despite obvious 
phenotypic differences between these taxa. This is consistent with reinvasion of 
their lower Arctic breeding grounds in the last 10,000-14,000 years or so. Nevertheless,
the RFLP approach was successful in distinguishing most subspecies (Van
Wagner and Baker, 1990), so we decided to sequence domain I of the control re- 
gion in Canada geese to see if we could locate genetic markers for this purpose (A. J. 
Baker and G. F. Shields, unpublished data). Fifty-five variable sites were located in 
a 396-bp fragment corresponding to region I in the snow goose. As with both previous
approaches, the control region sequences recovered two distinctive clades
corresponding to big-bodied and small-bodied subspecies (Fig. 3.7). The increased 
resolution of the control region over the cytochrome b sequences can be seen 
within each of these clades where all subspecies except B. c. moffitti and B. c. interior 
are distinguished. However, these results are based on sample sizes of one or two 
birds per taxon, and we need to analyze larger samples to check the magnitude of 
within-species polymorphism and its possible effects on the topology of the tree (see 
Edwards, Chapter 9 in this volume). 

Another feature of the control region tree is that the two sequences of the small- 
bodied B. c. taverneri are paraphyletic, one specimen from the state of Washington 
having an mtDNA sequence typical of big-bodied subspecies (possibly by hybrid- 
izing with B. c. parvipes) and the other possessing a haplotype typical of small-bodied 
subspecies. Misidentification of subspecies is another possible explanation for this result.
Assays of allozymes in Canada geese support the hypothesis of hybridization be- 
tween subspecies in the large-bodied and small-bodied groups (Van Wagner and 
FIGURE 3.7  Neighbor-joining tree showing genealogical relationships among control region haplotypes
found in 10 subspecies of Canada goose. The tree is rooted with a sequence from the Brent goose
(B. bernicla). The big-bodied subspecies occur in clade I and the small-bodied subspecies in clade II.
Bootstrap support for these two clades is indicated at their bases. 


Baker, 1986, 1990), leading to male-biased gene flow as males mate with females in 
mixed winter flocks and return to breed at the natal area of their mates. This pre- 
serves mtDNA population structure and the inferential value of matriarchal gene 
trees because of haploid transmission of this genome through females, but scrambles 
nuclear genes and makes them unreliable for inferring taxonomic relationships. 


E. Phylogeography of 
Gray-Crowned Babblers 


Gray-crowned babblers are cooperative breeders that are widely distributed in Aus- 
tralia and southern Papua New Guinea. The Carpentarian biogeographic barrier in 
northwest Queensland marks the boundary between two subspecies, an eastern 
paler-breasted Pomatostomus temporalis temporalis, and a western P. t. rubeculus with a 
reddish-brown breast and darker upperparts (Simpson and Day, 1984). To investigate
population structure in this highly social species, a 400-bp segment in domain I
of the control region of 163 gray-crowned babblers sampled from 12 populations
across Australia and Papua New Guinea was sequenced by Edwards (1993a). Of the
400 sites, 96 were variable across all 163 birds. Eighty-six haplotypes were detected, 
44 of which occurred in the 69 P. t. temporalis and 42 in the 94 P. t. rubeculus. 
Haplotypes of each subspecies fall into two distinct clades in the gene tree (see 
Fig. 9.6 in Edwards, Chapter 9 in this volume) constructed from these sequences, 
corroborating intraspecific taxonomy based on phenotypic characters. Sequence 
divergence within subspecies was pronounced too; the maximum value was 8% 
between the most divergent haplotypes in P. t. temporalis. Because of the large 
amount of sequence divergence within each subspecies, it is important to correct 
for this source of variability when using average sequence diversity to date their
divergence, as pointed out by Edwards and Kot (1995). This places the subspecies 
split between 275,000 and 425,000 years ago. 
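One common way to correct a between-group distance for within-group variability is Nei's net divergence, which subtracts the mean within-group diversity before a clock is applied. The sketch below shows only that arithmetic. It is not necessarily the exact procedure of Edwards and Kot (1995), and the function names and input values are hypothetical; they are not taken from the babbler data.

```python
def net_divergence(d_between, d_within_a, d_within_b):
    """Between-group distance minus the mean of the two within-group diversities."""
    return d_between - 0.5 * (d_within_a + d_within_b)


def divergence_time_years(d_net, rate_per_myr):
    """Convert a net pairwise divergence into years.

    rate_per_myr is the pairwise divergence accumulated per million years,
    in the same units as d_net (proportions here).
    """
    return d_net / rate_per_myr * 1_000_000


# Hypothetical inputs, chosen only to illustrate the calculation.
d_net = net_divergence(d_between=0.10, d_within_a=0.04, d_within_b=0.03)
years = divergence_time_years(d_net, rate_per_myr=0.20)
```

Without the correction, the same hypothetical inputs would date the split from the raw between-group distance and so overestimate its age.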

Cladistic analysis of gene flow based on the control region tree revealed two 
major surprises that were not evident from short-distance movements of adults 
(usually only a few territories, with an observed maximum of 25 km). First, the 
genealogies for each subspecies suggest that long-distance gene flow occurs frequently
among populations more than 1000 km apart. Second, sequences from unrelated
migrants were found in 10% of social groups. Despite this, the fraction of
sequence diversity that was apportioned among populations within each subspecies
was quite large (FST = 0.53 and 0.66 for P. t. temporalis and P. t. rubeculus, respectively).
Hence there appears to be considerable opportunity for kin selection within
each subspecies as gene flow on this scale is probably insufficient to counteract it. 


F. Recent Mixing of Lineages in 
Adélie Penguins and in Snow Geese 


Adélie penguins (Pygoscelis adeliae) breed colonially in suitable habitat in Antarctica, 
and mark-recapture studies indicate that both parents are highly philopatric to their 
natal areas (Ainley et al., 1983). Most of the 240,000 ± 24,000 Adélies on Ross
Island breed at one of three major colonies (Cape Bird, Cape Royds, and Cape 
Crozier) in geographically separated locales (Taylor et al., 1990). The glacial history 
of the Ross Sea region suggests that the colonies on Ross Island have been accessible 
to breeding birds from about 6500 to 10,000 years ago (Young, 1981), and thus they 
must have colonized these breeding sites from colonies that existed elsewhere dur- 
ing the Pleistocene. 

To investigate whether population structure has developed in Ross Island in the 
last 10,000 years, Monehan (1994) sequenced a hypervariable region of 300 bp in 
domain I of the control region of 81 Adélies from the three colonies. Seventy-six 
variable positions were found distributed across the sequences in 3 serial arrays and 
single-base substitutions. The ratio of transitions to transversions was 5:1 (65 tran- 
sitions, 13 transversions), and there were two insertions. Haplotypic diversity was 
very high; 67 haplotypes were found in the 81 sampled Adélies (Table VI). The 
most striking feature of the sequences was that they can be divided into two distinctive
types differing on average by 5.1% (Kimura two-parameter corrected distance).
This basic dichotomy was recovered in a neighbor-joining tree, but haplotypes
from the three colonies were scattered throughout these two clades.

A frequency or “mismatch” distribution of pairwise sequence divergences among 
haplotypes naturally generated a bimodal curve, one peak near the ordinate repre- 
senting the close similarity among haplotypes within each clade, and the other 
peak reflecting the more distant comparisons among haplotypes from different 
clades (Fig. 3.8). One interpretation of this bimodality and high haplotypic diversity 
is that the Ross Island population (and its ancestral precursors) has maintained a large 
TABLE VI  Control Region Variation in Adélie Penguins from Three Colonies in Ross Island(a)

Region          N     No. of haplotypes   Diversity   Sequence divergence (%)
Cape Bird       31    28                  1.024       3.88
Cape Crozier    27    23                  1.014       3.11
Cape Royds      23    21                  1.036       2.55
Ross Island     81    67                  0.982       3.86

(a) Modified from Monehan (1994).
FIGURE 3.8 Mismatch distributions of pairwise comparisons among control region haplotypes found
in Adélie penguins and in a sample of dunlins from Europe. 
and constant size over its evolutionary history (Marjoram and Donnelly, 1994; Rogers
and Harpending, 1992). This would require that the number of breeding females
has been roughly equal to the current census number of breeding females [Na f =
240,000] for the history of the population, and this seems unlikely on purely demographic
grounds. Bimodal mismatch distributions can also arise under different
demographic conditions than constant population size, further weakening this line
of reasoning (Slatkin and Hudson, 1991; Marjoram and Donnelly, 1994). The best
explanation for the widespread distribution of haplotypes from both clades across 
Ross Island is that of recent mixing of two populations that differentiated in allopa- 
try during the Pleistocene. The unimodal mismatch distribution for 25 European 
dunlins (C. a. alpina) is, by contrast, indicative of an expanding population (Rogers 
and Harpending, 1992; Rogers, 1995). 
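A mismatch distribution is simply a tally of the number of differences in every pairwise comparison of sequences. The sketch below uses four made-up toy sequences (two per "clade"), not the penguin or dunlin data, and hypothetical function names, to show how two divergent groups of similar haplotypes produce two modes.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(seq1, seq2):
    """Number of sites at which two aligned sequences differ."""
    return sum(1 for a, b in zip(seq1, seq2) if a != b)


def mismatch_distribution(sequences):
    """Map each observed number of pairwise differences to its frequency."""
    counts = Counter(
        pairwise_differences(s, t) for s, t in combinations(sequences, 2)
    )
    return dict(sorted(counts.items()))


# Toy data: two pairs of near-identical sequences separated by four fixed sites.
toy = ["AAAAAAAA", "AAAAAAAT", "CCCCAAAA", "CCCCAAAT"]
distribution = mismatch_distribution(toy)
```

In the toy data the within-group comparisons cluster at one difference and the between-group comparisons at four or five, which is the bimodal pattern described for the Adélie penguin haplotypes.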

Recent mixing of two populations that differentiated in separate Pleistocene 
refugia is also illustrated by an analysis of control region sequence variation in 
domain I of the lesser snow goose in North America (Quinn, 1992). Evidence for 
a vicariant origin of the populations comes from the presence of two types of se- 
quences (clades I and II) differing by 6.7%, and which are distributed across the 
entire subspecies range. The sequences were not correlated with the color phase of 
the birds, or with sex or sampling locale. Although these divergent haplotypes could 
again have been maintained in a single panmictic population that has been large 
throughout its history, the most compelling argument against this interpretation is 
that the refugial populations almost certainly would have encountered bottlenecks 
of reduced size (Quinn, 1992). 

Assuming a molecular clock of 20.8% divergence per million years for control
region I, the two allopatric populations are estimated to have split about 350,000 years ago. Birds from
one of these populations carrying clade II haplotypes are thought to have spread 
across the continent, separating into two more populations about 110,000 years ago 
that are today part of the eastern and western populations. Then birds with a differ- 
ent type of mtDNA (clade I) from the other refugium more recently spread across 
the continent and mixed with the clade II birds. This study is also instructive in 
warning against the potential pitfall of mixing nuclear and mitochondrial control 
region sequences; waterfowl are particularly problematic in this regard as many spe- 
cies we and others (Quinn and White, 1987; Quinn, 1992; K. Scribner, personal 
communication) have studied have translocated copies of most mitochondrial genes 
and the control region in the nucleus. 


IV. HIGHER LEVEL SYSTEMATICS 


There are no published applications of the use of control region sequences in the 
molecular systematics of birds at and above the species level. To investigate the 
utility of such fast-evolving sequences among closely related species we (Marshall 
and Baker, 1997) sequenced a 609-bp segment of the control region of the subspe- 
FIGURE 3.9 (A) Plot of transitions versus transversions for pairwise comparisons of control region
sequences of the finches shown in the maximum likelihood tree (B). The points to the right are comparisons
between Fringilla species and Carduelis chloris. (B) Maximum likelihood tree among the finch
species, estimated in PHYLIP 3.41 (Felsenstein, 1991) using empirical base frequencies and a
transition:transversion ratio of 1.64.


cies of common chaffinch (F. coelebs), and the other two species in the genus, the 
blue chaffinch (F. teydea) and the brambling (Fringilla montifringilla). We also se- 
quenced the same segment from the greenfinch (Carduelis chloris) as an outgroup. 
The sequenced segment encompasses most of domains II and III of the control 
region. 

As might be expected in comparisons between two genera, the C. chloris se- 
quence differs in length from that of the species of Fringilla, as the former possesses 
three 1-base deletions, three single-base insertions, two 2-base insertions, and one
6-base insertion relative to the latter. There are only two alignment gaps within 
Fringilla. Thus in general there is no real problem associated with alignment of the 
segment we sequenced. The number of variable sites increases with higher levels in 
the taxonomic hierarchy; there are 57 variable sites within F. coelebs and its subspecies,
98 within Fringilla, and 153 when C. chloris is included. A plot of transitions
versus transversions for the finch sequences indicates that within the genus Fringilla 
the relationship is linear, but it quickly plateaus as saturation occurs in comparisons 
between Fringilla and Carduelis (Fig. 3.9A). Kimura two-parameter corrected distances
(×100) range from 1.4-11.0 within Fringilla to 15.6-19.6 between the two
genera. 
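The Kimura two-parameter distance quoted above can be computed from the observed proportions of transitional (P) and transversional (Q) differences as d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The following is a minimal sketch with a hypothetical function name; it ignores sites with gaps or ambiguity codes and makes no attempt at the rate-variation corrections a full analysis might use.

```python
import math


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} in ({"A", "G"}, {"C", "T"}):
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log / math.sqrt raise ValueError once the sequences are saturated.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites: P = 0.1, Q = 0.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

Because the correction grows rapidly as P and Q increase, the corrected distances between genera exceed the raw proportion of differing sites by more than those within Fringilla.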

Alternative use of C. chloris, F. montifringilla, or F. teydea as an outgroup results in 
some instabilities of the genealogical relationships among the subspecies of F. coelebs 
in the ingroup with the neighbor-joining method, but not with maximum likeli- 
hood (Fig. 3.9B). This problem does not arise from differences in base composition 
of the sequences (see Hasegawa and Hashimoto, 1993; Lockhart et al., 1994) be- 
cause they all have similar compositions that are AT rich in the L strand. We con- 
clude that control region sequences are likely to be most useful in higher systematics 
in determining phylogenetic relationships among closely related species, but even 
then careful consideration will need to be given to choice of outgroups and tree- 
building methods. 
I. INTRODUCTION


A. Avian Phylogenies Inferred from 
Mitochondrial DNA 


Mitochondrial DNA (mtDNA) seemingly has enormous value for resolving the 
phylogenies of recently evolved avian taxa, and numerous phylogenetic studies of 
avian groups have been carried out using mtDNA variation as a source of characters. 
Until recently these studies were based on restriction site variation (Kessler and 
Avise, 1984, 1985; Avise et al., 1990; Zink and Avise, 1990; Zink and Dittmann, 
1991, 1993a,b; Zink et al., 1991a,b; Bermingham et al., 1992; Tarr and Fleischer, 
1993; Zink, 1993), but this technology has been supplanted by the technology of 
amplifying by the polymerase chain reaction (PCR) specific genes or portions of 
genes that are then sequenced (Saiki et al., 1988; Kocher et al., 1989; Kocher and 
White, 1989; Kocher, 1992). The mitochondrially encoded cytochrome b gene 
(cyt b) has been used most often in avian phylogenetic studies based on DNA se- 
quences (Edwards and Wilson, 1990; Edwards et al., 1991; Quinn et al., 1991; Lan- 
yon, 1992, 1994; Helm-Bychowski and Cracraft, 1993; Kornegay et al., 1993; Kus- 
mierski et al., 1993; Avise et al., 1994a,b; Krajewski and Fetzner, 1994; Lanyon and 
Hall, 1994; Heidrich et al., 1995; Krajewski and King, 1995), although the 12S
ribosomal RNA gene has also been used (Cooper et al., 1992; Cooper, 1994; Mindell
et al., 1996).

Mitochondrial DNA is attractive for phylogenetic studies because of its conser- 
vative evolution with regard to gene order and, in its protein-coding genes, conser- 
vative amino acid replacement and occurrence of insertions and deletions (Brown, 
1985; Desjardins and Morais, 1990, 1991), contrasted with a high rate of synony- 
mous substitutions (Brown et al., 1982; Thomas and Beckenbach, 1989; Edwards 
et al., 1991; Irwin et al., 1991). Conservation of gene order and amino acid codons 
makes it easy to align sequences (establish homology) among species and to design 
PCR primers for a diversity of species (Kocher et al., 1989; Edwards et al., 1991), 
whereas the rapid rate of silent substitution increases the probability that the mole- 
cule will contain synapomorphies that reveal recent periods of shared ancestry. The 
latter point is particularly important to avian systematics because birds characteris- 
tically exhibit low levels of divergence in both nuclear and mitochondrial genes. 
Avian taxa appear to be "shifted down" approximately one taxonomic level relative 
to other vertebrate taxa (Kessler and Avise, 1985); e.g., divergences between species 
within a mammalian genus are approximately the same as those between species 
in distinct avian genera (Avise and Aquadro, 1982; Shields and Helm-Bychowski, 
1988). 

To refine and enhance methods of phylogenetic inference on the basis of geno- 
typic characters, it is important to distinguish the problem of resolving the gene 
tree from the problem of whether the gene tree, once resolved, is congruent with 
the species tree. Considering the gene-tree/species-tree problem first, the vertebrate
mitochondrial genome comprises 13 protein-coding genes, 22 tRNA genes,
2 rRNA genes, and a control region but is inherited as a single linkage group exclu- 
sively through the female germ line. This means that only one independent gene 
tree can be estimated from the mitochondrial genome, no matter how many indi- 
vidual genes are sequenced. This is seemingly a severe limitation on the phylo- 
genetic utility of mtDNA because lineage sorting of ancestral polymorphisms may 
result in a mitochondrial haplotype tree that is incongruent with the species tree 
(Neigel and Avise, 1986; Nei, 1987; Pamilo and Nei, 1988; Wu, 1991; Hudson, 
1992; Moore, 1995). Thus, the true topology of the species tree could be further 
explored only by resolving trees from genes located in distinct linkage groups in the 
nuclear genome. For this reason, many phylogeneticists insist that a phylogeny can- 
not be considered resolved until multiple independent nuclear gene trees have been 
resolved and compared. However, Moore (1995) has shown that the mitochondrial 
haplotype tree has a much higher probability of congruence with the species tree 
than does a single nuclear gene tree and that a substantial number (16+) of inde- 
pendent nuclear gene trees would need to be resolved to be as confident of the 
species tree inferred from the mitochondrial haplotype tree. In fact, it is probable 
that a correctly resolved mtDNA haplotype tree will accurately reflect the species 
tree unless the internodes (times) between speciation events are short (Moore, 
1995). The relatively high probability of congruence of the mitochondrial haplo- 
type tree with the species tree is a consequence of the small effective size of the 
population with regard to the mitochondrial genome (Moore, 1995). Thus, the 
mitochondrial haplotype tree could be the most powerful tool for inferring species 
phylogenies for avian groups, provided that there is sufficient variation in mito- 
chondrial haplotypes among avian species to allow resolution of the haplotype tree. 
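The effect of effective population size on congruence can be illustrated with the standard three-taxon coalescent result (Pamilo and Nei, 1988): for a nuclear autosomal gene, the probability that the gene tree matches the species tree is 1 - (2/3) exp(-T / 2N), where T is the internode length in generations and N is the effective population size. The sketch below assumes, as a simplification, that the mitochondrial effective size is one-quarter of the nuclear value (equal sex ratio, strictly maternal inheritance). It does not reproduce Moore's (1995) full analysis, and the function name is hypothetical.

```python
import math


def p_congruent(internode_generations, effective_size, mitochondrial=False):
    """Probability that a three-taxon gene tree matches the species tree."""
    # Coalescent time scale: 2N generations for a nuclear autosomal locus.
    scale = 2.0 * effective_size
    if mitochondrial:
        scale /= 4.0  # assumed one-quarter effective size for mtDNA
    return 1.0 - (2.0 / 3.0) * math.exp(-internode_generations / scale)


# Internode equal to N generations (hypothetical values).
p_nuclear = p_congruent(10_000, 10_000)
p_mito = p_congruent(10_000, 10_000, mitochondrial=True)
```

With an internode of N generations, the mitochondrial tree is congruent with probability about 0.91 and a single nuclear gene tree with probability about 0.60. Both fall to one-third as the internode shrinks toward zero, which is why short internodes remain problematic for any single locus.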


B. Objectives of This Study 


In this chapter, we focus on problems associated with resolving mt-haplotype trees 
from DNA sequence data. Specifically, we will assess the value of cyt b as a source 
of characters for inferring avian phylogenies and try to determine a window in the 
systematic hierarchy of birds where cyt b sequence should result in efficient resolu- 
tion of phylogenetic relationships. We focus on cyt b because it has been, by far, the 
most prevalent source of sequence data in avian studies but also because cyt b has 
developed a "bad reputation" as a source of characters for phylogenetic studies 
(Hillis and Huelsenbeck, 1992; Graybeal, 1993; Avise et al., 1994a,b; Meyer, 1994; 
Honeycutt et al., 1995), a reputation that seems contradicted by several avian stud- 
ies. Although it is apparent from these studies that cyt b fails to resolve some ancient 
nodes but succeeds in resolving more recent relationships, the range of levels in the 
systematic hierarchy where it does work is not clear. Moreover, there is a danger 
that a general extrapolation will be made that cyt b is a poor choice for all phylogenetic
studies, a danger of "throwing the baby out with the bath water." By
determining more accurately a range of conditions where cyt b does produce positive
results, we hope to preclude such a generalization.

We will argue that cyt b not only works for resolving relatively recent evolution- 
ary history but may be the best choice for birds because of their tendency to have 
low rates of genic divergence at high taxonomic levels, compared to other verte- 
brate groups. Our analysis is based on comparisons among four cyt b sequence data 
sets: three from published phylogenetic studies (oscines [birds of paradise, and other 
passerines], Helm-Bychowski and Cracraft, 1993; cranes, Krajewski and Fetzner, 
1994, and Krajewski and King, 1995; barbets and toucans, Lanyon and Hall, 1994) 
and one previously unpublished set of sequences from woodpeckers. We will use 
the four data sets to examine a number of parameters that are useful for character- 
izing patterns and extent of nucleotide substitution between diverging sequences, 
and we will relate these parameters to circumstances where nodes are successfully 
and unsuccessfully resolved. Our goals are to better understand why cyt b works in 
some circumstances but not others, and to develop guidelines that would serve in a 
pilot study designed to determine whether cyt b would likely work well for resolving
the phylogeny of a particular group, or whether one would be better off deriving
data from another gene.


C. Properties of DNA Sequences That Are 
Ideal for Recovering Phylogenies 


Knowledge of what properties of DNA sequences should lead to recovery of the 
phylogeny would be useful for focusing attention on these properties in a pilot study 
and for bolstering the credibility (or incredibility) of empirical studies by identifying 
ranges of the parameter space where one should (or should not) be able to resolve 
phylogenetic relationships successfully. Unfortunately, these ideal properties are not 
completely known. One finds broad (but not complete) agreement that certain 
properties are sufficient to allow recovery of the true phylogeny by at least some of 
the available tree-building algorithms, but what conditions are necessary is much 
more contentious. Felsenstein (1988) provides an excellent review of phylogenetic 
inference based on DNA sequences. Springer and Krajewski (1989) reviewed phy- 
logenetic inference based on DNA-DNA hybridization. Although their review 
focuses on the validity of DNA-DNA hybridization methodology, they describe 
an "ideal model of genomic evolution," ideal for recovering the true phylogeny 
by distance methods. However, much of their discussion is germane to recover- 
ing phylogenies from DNA sequence data, particularly if one opts to use a dis- 
tance method. It is not our intent to review and evaluate these complex arguments. 
A distillation of the arguments is that recovering the true gene tree is likely if 
(1) nucleotide substitutions occur fast enough to "mark" periods of shared ancestry, 
but not so fast that these synapomorphies are obliterated by "multiple hits" after 
daughter species diverge from the ancestral species (Lanyon, 1988); (2) substitution 
rates are the same or at least similar along diverging lineages (i.e., there is a molecular
clock); (3) nucleotide composition (relative frequencies of G, A, T, and C) does not 
change bias during the evolution of the group (Irwin et al., 1991); and (4) there is 
ample sequence available (Saitou and Nei, 1986; Cracraft and Helm-Bychowski, 
1991; Nei, 1991). The amount of sequence required to resolve the gene tree de- 
pends on the actual structure of the true tree (Saitou and Nei, 1986; Lanyon, 1988; 
Nei, 1991); in general, when internodes are short, more data are required. 


II. MATERIALS AND METHODS
A. Producing Sequence Data 


Protocols for amplifying by PCR and sequencing the mitochondrially encoded cyt b 
gene from birds are well established (Kocher et al., 1989; Meyer et al., 1990; Ed- 
wards et al., 1991; Krajewski and Fetzner, 1994; Lanyon and Hall, 1994). Several 
instances have been reported of unwitting amplification of sequences from con- 
taminant templates (see Derr et al., 1992; Helm-Bychowski and Cracraft, 1993; 
Hackett et al., 1995, for discussion). If precautions are not taken, this is easy to do 
because the PCR is capable of amplifying to high concentration a DNA sequence 
represented by a single initial copy, and even reassembling degraded templates into 
whole templates, which are then amplified (Pääbo et al., 1990). The woodpecker
sequences reported here were obtained following procedures modified from Kocher
et al. (1989), Meyer et al. (1990), and Edwards et al. (1991). All of the modifications
were additions to the protocols intended to reduce the probability of amplification
from a contaminant template. These include (1) use of aerosol-resistant
pipette tips (ART; Promega, Madison, WI) in all procedures involving DNA isolation
and PCR; (2) exposure of DNA isolation reaction mixes to short-wave ultraviolet
(UV) light (exposure is 30 min at 7.2 joules/m²/sec, wavelength = 254 nm)
before adding the tissue; and (3) a similar exposure of PCR reaction mixes to short- 
wave UV light before addition of the template and Taq polymerase (Cimino et al., 
1990). Finally, we sequenced two specimens for each species. This gives considerable
assurance that the sequences are indeed from the target species if the replicate sequences
are nearly identical or more similar to each other than to other sequences.
Sequencing replicate specimens is expensive, but it is common practice to sequence 
both the light and heavy strands from a single specimen. However, for a phyloge- 
netic study, sequencing one strand from each of two specimens, and accepting a low 
level of unresolved base pairs (bp), may be a better investment than sequencing both 
strands from a single specimen (see below). 
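The replicate check described above (conspecific sequences should be more similar to each other than to any other sequence) can be sketched as a simple screen. The function names, the use of uncorrected p-distances, and the toy sequences are assumptions for illustration only; they are not the procedure or data used for the woodpecker sequences.

```python
def p_distance(seq1, seq2):
    """Proportion of differing sites, ignoring gaps and ambiguity codes."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def suspicious_species(samples):
    """Return species whose two replicates are not each other's closest match.

    samples maps a species name to a list of aligned replicate sequences.
    """
    flagged = []
    for species, replicates in samples.items():
        if len(replicates) < 2:
            continue  # nothing to compare
        within = p_distance(replicates[0], replicates[1])
        between = min(
            (
                p_distance(rep, other_seq)
                for rep in replicates
                for other, seqs in samples.items()
                if other != species
                for other_seq in seqs
            ),
            default=float("inf"),
        )
        if within >= between:
            flagged.append(species)
    return flagged


# Hypothetical toy data: the second data set mimics a contaminated replicate.
clean = {"sp1": ["AAAAAAAA", "AAAAAAAT"], "sp2": ["CCCCAAAA", "CCCCAAAT"]}
mixed = {"sp1": ["AAAAAAAA", "AAAAAAAT"], "sp2": ["CCCCAAAA", "AAAAAAAA"]}
```

A flagged species is only a prompt to re-extract and re-sequence; closely related taxa with little divergence could also trip such a screen.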

The woodpecker sequences reported here are for the light strand. We amplified 
and sequenced 1047 bp of the 1143-bp cyt b gene as two overlapping fragments that 
were sequenced using three of the four end primers from the PCR reactions and 
one additional internal primer. (In the following primer descriptions, the published 
name of the primer is italicized and the position of the terminal base at the 3' end
is given relative to the chicken mitochondrial light strand numbering convention;
Desjardins and Morais, 1990). The 5' fragment (705 bp, excluding primers) was
amplified with primers L14841 (Kocher et al., 1989; 3' = CL14990) and H15547
(Edwards et al., 1991; 3' = CL15696) and the 3' fragment (755 bp) with primers
CBL15311 (A. Meyer, personal communication; 5' GCAAGCTTCTACCATCACCACAAATATC 3',
3' = CL15311) and H16065 (Helm-Bychowski and
Cracraft, 1993; 3' = CL16065 in tRNA(Thr)). Sequencing primers were L14841 for
the 5' fragment and CBL15311, H16065, and an internal primer, L15424 (Edwards
et al., 1991, 3' = CL15569), for the 3' fragment. Two specimens were sequenced
for each species, except Melanerpes carolinus, and the authenticity of the
sequences was established by comparing conspecific sequences. We have amplified a
second M. carolinus specimen, but the sequence is not complete. We have established 
that the partial sequences for both fragments of the second specimen are nearly 
identical to the first. The specimens are tabulated in Appendix I along with voucher 
numbers and locale data. 

Sequencing was done on an Applied Biosystems (Foster City, CA) automated 
sequencer (Dye Deoxy Terminator Cycle sequencing kit) in the Wayne State Uni- 
versity Center for Molecular Medicine and Genetics Core DNA Sequencing Fa- 
cility (Detroit, MI). 


III. RESULTS
A. Sequences 


The 11 woodpecker cyt b sequences span 1047 bp or 349 codons (Appendix II). 
Sequences were not obtained for 96 bp from the 5’ end of the gene. We obtained 
only 946 bp for the two Veniliornis nigriceps specimens (GenBank Accession Nos.
U83282-U83302).


B. Phylogenetic Analysis 


It is not our goal to reanalyze the three published data sets or to analyze the wood- 
pecker sequences in exhaustive detail, but rather to examine patterns of nucleotide 
substitution, over the evolutionary history of these groups, that would provide in- 
sight as to when cyt b is useful for inferring evolutionary history. It is more im- 
portant to our analyses to have trees that are comparable rather than to have the 
tree that is judged best by some criterion or consensus of criteria. Thus, the four 
phylogenies in Fig. 4.1a-d were estimated in the same "standard" way; they are
neighbor-joining trees based on the Tamura-Nei estimator of nucleotide divergence,
all three codon positions, and both transitions and transversions (Tamura and
Nei, 1993; Kumar et al., 1993). The Tamura-Nei estimator corrects for secondary
substitutions (multiple hits), differences in substitution rates between transitions and
transversions, differences in substitution rates between particular classes of transitions
(A to G vs C to T) and between particular classes of transversions, and for
composition bias. All of these are significant in the divergence of cyt b among avian 
lineages. 
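
As a rough illustration of the corrections just listed, the following Python sketch computes a Tamura-Nei (1993) distance for a single pair of aligned sequences. It is a minimal sketch under stated assumptions, not the program used to build Fig. 4.1: the function names are hypothetical, base frequencies are simply averaged over the two sequences, and sites with gaps or ambiguity codes are skipped. The formula is undefined when a base is absent or when the sequences are so divergent that a logarithm argument becomes nonpositive.

```python
import math


def tn93_distance(freq, p1, p2, q):
    """Tamura-Nei (1993) distance from summary proportions.

    freq: dict of base frequencies for A, C, G, T (summing to 1)
    p1:   proportion of sites with an A<->G difference (purine transitions)
    p2:   proportion of sites with a C<->T difference (pyrimidine transitions)
    q:    proportion of sites with a transversion difference
    """
    g_a, g_c, g_g, g_t = (freq[b] for b in "ACGT")
    g_r, g_y = g_a + g_g, g_c + g_t  # purine and pyrimidine totals
    k1 = 2 * g_a * g_g / g_r
    k2 = 2 * g_t * g_c / g_y
    k3 = 2 * (g_r * g_y - g_a * g_g * g_y / g_r - g_t * g_c * g_r / g_y)
    w1 = 1 - p1 / k1 - q / (2 * g_r)
    w2 = 1 - p2 / k2 - q / (2 * g_y)
    w3 = 1 - q / (2 * g_r * g_y)
    return -k1 * math.log(w1) - k2 * math.log(w2) - k3 * math.log(w3)


def tn93_from_sequences(seq1, seq2):
    """Distance for two aligned sequences; non-ACGT sites are skipped."""
    purines = {"A", "G"}
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    # Base composition averaged over both sequences (an assumption of this sketch).
    freq = {base: sum((a == base) + (b == base) for a, b in pairs) / (2 * n)
            for base in "ACGT"}
    p1 = sum(1 for a, b in pairs if {a, b} == {"A", "G"}) / n
    p2 = sum(1 for a, b in pairs if {a, b} == {"C", "T"}) / n
    q = sum(1 for a, b in pairs if (a in purines) != (b in purines)) / n
    return tn93_distance(freq, p1, p2, q)
```

With equal base frequencies and equal rates for the two transition classes, the expression reduces to the Kimura two-parameter distance, which gives a convenient sanity check on any implementation.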

To show the relative antiquity of the taxa represented in the data sets, trees in 
Fig. 4.1a—d were drawn to the same scale and adjusted to a common time reference 
(dashed vertical line), assuming a molecular clock. This was accomplished by root- 
ing each tree with a specified outgroup and calculating the average distance to the 
branch tips relative to the shortest branch for the ingroup. Each phylogeny was then 
positioned with this average on the common time line, which represents the pres- 
ent. The branch lengths are given above the branches. It is apparent that there is 
substantial variation in the antiquity of the clades, and overall the four phylogenies 
provide a good representation of bifurcations ranging from recent to ancient. 

We used bootstrap proportions (P) in some analyses as a relative measure of the 
support for nodes in phylogenies (Felsenstein, 1985). Work has shown that P is
biased such that it actually underestimates the true probability of a node in
circumstances where there is strong support for the node in the data, but overestimates
the true probability when the phylogenetic signal for the node is weak (Hillis and Bull,
1993; Felsenstein and Kishino, 1993). Thus, given that P is a biased estimator, it is
biased in the direction one would hope in that it leads to conservative assertions
that a node is strongly supported by the data. On the basis of simulations under a
wide range of conditions, Hillis and Bull (1993) found that nodes inferred by bootstrap
proportions of 70% or more actually occurred in 95% or more of the simulated
phylogenies. In other words, P ≥ 70% corresponded roughly to a 95% probability
that the node was real.


1. Oscines 


This data set (Helm-Bychowski and Cracraft, 1993) comprises 1143 bp or 381 co- 
dons, which is the entire cyt b gene, from 11 species. Although the phylogenetic 
objectives of Helm-Bychowski and Cracraft were to evaluate relationships among 
the birds of paradise, and their relationship to bowerbirds, the authors included 
several more distantly related passerines to serve as outgroup taxa. Empidonax mini- 
mus, a suboscine, was designated as the outgroup for purposes of rooting, whereas 
the ingroup comprises oscines. Overall, this is the most weakly supported of the 
four cyt b trees analyzed in this study. Helm-Bychowski and Cracraft performed 
parsimony analyses including both transitions and transversions (global parsimony) 
and transversions only. Our neighbor-joining tree (Fig. 4.1c) is based on the same 
data as the global parsimony tree of Helm-Bychowski and Cracraft. Their two par- 
simony trees differ in substantial ways and our neighbor-joining tree has various 
nodes in common, and in difference, with both. However, all three are identical 
with regard to the only two nodes that are strongly supported by bootstrap
proportions: clustering of the three paradisaeinine birds of paradise (Epimachus fastuosus,
Ptiloris paradiseus, and Diphyllodes magnificus), to the exclusion of the one
manucodine bird of paradise (Manucodia keraudrenii), and the clustering of the two
bowerbirds (Ptilorhynchus violaceus and Ailuroedus melanotus). The other nodes appear to be
too short and/or too deep in evolutionary history to expect resolution by cyt b
sequence (see below).


2. Barbets and Toucans 


This data set (Lanyon and Hall, 1994) comprises 888 bp or 296 codons from seven 
species; a woodpecker, Sphyrapicus varius, was specified as the outgroup. Sequence 
was not obtained for 123 bp from the 5’ end of the gene and 132 bp from the 3’ 
end. Lanyon and Hall performed global and transversion-only parsimony analyses, 
each yielding the same topology. Our neighbor-joining tree (Fig. 4.1a) has a nearly 
identical topology to their most parsimonious tree (Fig. 3A in Lanyon and Hall, 
1994), differing only in that the parsimony tree indicated that the two toucan spe- 
cies (Ramphastos tucanus and Aulacorhynchus derbianus) did not form a monophyletic 
group, whereas they do in the neighbor-joining tree. The placement of the toucans 
is weakly supported in both analyses. In the parsimony analysis, Lanyon and Hall 
(1994) noted that the toucans did join in a monophyletic group when the Old 
World barbets (Pogoniulus bilineatus and Lybius bidentatus) were designated as the 
outgroup, rather than the sapsucker (Sphyrapicus varius). In the neighbor-joining 
analysis, the branch uniting the toucans was supported by 58% of 500 bootstrap 
replicates. The parsimony and neighbor-joining trees are identical in supporting the 
monophyly of the Old World barbets, albeit weakly (54% of 500 bootstraps in the 
neighbor-joining tree) and strongly supporting the inference that the New World 
barbets (Capito) form a clade with the toucans and not the Old World barbets (100%
of 500 bootstrap replicates support the toucan-New World barbet clade).


3. Cranes 


This data set (Krajewski and King, 1995) comprises 1137 bp from 20 species or 
subspecies; 2 bp are undetermined at the 5’ end of the gene and 4 bp are undeter- 
mined at the 3’ end. (Subsequent to providing us with the sequences, Krajewski 
and King resolved all ambiguous bases, and increased the length of each sequence 
to 1143 bp for publication in their paper). Krajewski and King carried out several
distance analyses involving permutations of distance estimators (Kimura
two-parameter, Tamura-Nei, maximum likelihood), categories of variation (transitions
plus transversions, transversions only, synonymous, nonsynonymous), and tree
algorithms (least squares, neighbor-joining, maximum likelihood); they also did several
parsimony analyses with a variety of weighting schemes. All of the distance
analyses produced the same topology, which is presented in Fig. 4.1b. Krajewski
and King's parsimony analysis, based on informative sites and equal weighting of all
characters, produced a single most parsimonious tree, which differed from the
distance tree topology in only one respect: joining the Americana species group (Grus
japonensis, G. americana, G. grus, G. monachus, and G. nigricollis) with the
Anthropoides (Anthropoides + Bugeranus)-G. canadensis clade rather than the Antigone
(G. antigone, G. rubicunda, G. vipio)-G. leucogeranus clade, as in the distance tree.
However, this disparity is not statistically significant because the nodes in question are
weakly supported by bootstrap replicates, 27% in the case of the Antigone-G.
leucogeranus clade joined with the Americana species group (Fig. 4.1b).


4. Woodpeckers 


The neighbor-joining tree is presented in Fig. 4.1d. A branch-and-bound search 
found a single most parsimonious tree, which had a topology identical to that of the 
neighbor-joining tree. The woodpeckers comprise one of three subfamilies within 
the Picidae, the Picinae. The piculets (Picumnus aurifrons, Fig. 4.1d) comprise a sec- 
ond subfamily, which is thought to be the sister group of the woodpeckers (Short, 
1982). We specified P. aurifrons as the outgroup to root the tree. 

The cyt b tree is at odds in some respects with classic views on woodpecker 
relationships (Short, 1982). The cyt b tree strongly supports a sister group relation- 
ship between the pileated woodpecker (Dryocopus pileatus) and a clade comprising 
the genera Colaptes (flickers) and Piculus, strongly supports a sister group relationship 
between the genera Picoides and Veniliornis, and weakly supports a sister group
relationship between sapsuckers and melanerpines. All of these relationships have been
contentious historically, with at least someone advocating the relationships apparent
in the cyt b tree at some time (Bock and Miller, 1959; Short and Morony, 1970).
These relationships are also strongly supported by a comparable analysis of 1512 bp
of the mitochondrial COI gene, sequenced for the same set of specimens (except
Colaptes rupicola). The relationship between sapsuckers and melanerpines was
supported by the COI data in 96% of 500 bootstrap replicates (DeFilippis, 1995). The
placement of Campephilus is less certain, although it does appear to have diverged 
early in the radiation of the woodpeckers. Finally, the moderately strong support 
for the pairing of Colaptes rupicola with Piculus rubiginosus rather than Colaptes auratus 
is not implausible. It is recognized that these two genera are very closely related 
(Short, 1982), and preliminary analyses, based on cyt b, involving more species of 
Colaptes and Piculus always result in a paraphyletic tangle of species (Moore, 1995 
and unpublished). 


C. Base Composition 
Base composition for the three positions within codons is summarized in Table I.
We point out two properties that impinge on the phylogenetic signal accumulated
and retained by evolving cyt b sequences: (1) there is strong bias relative to a uniform
distribution of 25% for each base; the bias varies markedly among positions
and is most pronounced at position 3, where guanine comprises at most 4.0%;
(2) variation in the pattern of bias is not great within or between families, with the
exception of A and C at third positions, which vary by as much as 13% between
families comprising Piciformes and those comprising Passeriformes.


TABLE I  Average Cytochrome b Base Composition for Avian Families^a

                 No. of
Family          species     A1     T1     C1     G1     A2     T2     C2     G2     A3     T3     C3     G3
Galliformes
  Phasianidae         1   26.8   22.8   29.7   20.7   20.7   39.1   27.8   12.3   34.7   10.5   51.6    3.2
Piciformes
  Picidae            11  24.13  23.53  29.38  22.95  19.23  41.17  25.67  13.91  29.83  13.07  53.10   4.00
                            ±?  ±0.45  ±0.73  ±0.84  ±0.37  ±0.60  ±0.67  ±0.46  ±2.44  ±2.53  ±3.74  ±1.16
  Lybiidae               24.35  24.35  29.45  21.80  19.60  39.55  28.05  12.80  36.40  12.95  49.00   1.70
                         ±0.78  ±0.64  ±0.21  ±0.42  ±0.00  ±0.49  ±0.49  ±0.00  ±1.27  ±3.89  ±0.71  ±1.98
  Ramphastidae           24.05  24.48  28.85  22.60  19.25  39.78  27.90  13.05  28.85  17.80  49.65   3.65
                         ±0.42  ±0.71  ±0.71  ±0.49  ±0.42  ±0.07  ±0.92  ±0.64  ±2.97  ±0.78  ±0.49  ±1.70
Gruiformes
  Gruidae            20  26.39  22.21  30.46  21.00  20.35  39.59  27.02  13.08  38.00  12.73  46.55   2.71
                         ±0.40  ±0.61  ±0.60     ±?  ±0.14  ±0.28  ±0.31  ±0.10  ±1.01  ±1.49  ±1.36  ±0.79
Passeriformes
  Paradisaeidae          26.33  21.83  29.40  22.45  20.60  41.00  25.40  13.00  42.65  14.50  40.10   2.75
                         ±0.57  ±0.87  ±1.49  ±0.85  ±0.12  ±0.47  ±0.47  ±0.12  ±1.09  ±0.87  ±0.57  ±0.44
  Laniidae            1   26.8   22.3   28.6   22.3   20.7   41.2   24.7   13.4   40.2   17.6   38.3    3.9
  Corvidae            1   24.7   23.4   28.9   23.1   20.5   41.5   25.2   12.9   40.2   15.5   40.4    3.9
  Vireonidae          1   25.5   20.2   30.7   23.6   20.5   40.4   26.0   13.1   45.6    9.8   42.7    1.8
  Muscicapidae        1   25.7   21.3   29.7   23.4   21.0   40.4   26.0   12.6   37.8   12.3   46.2    3.7
  Tyrannidae          1   26.3   23.9   28.7   21.1   20.0   40.5   26.6   12.9   37.4   15.3   44.2    3.2

^a Average percentage of each nucleotide [adenine (A), thymine (T), cytosine (C), and guanine (G)], ±SD, is given for each of the three positions
within codons. Compositions were averaged over all species within families (number of species). The Phasianidae composition is for the chicken
(Desjardins and Morais, 1990).


D. Pairwise Comparisons among 
Divergent Sequences 


Sequence comparison is a powerful tool for understanding processes and patterns of 
nucleotide substitution and how these might impinge on phylogenetic inference. 
For example, pairwise comparisons have provided insight on variation in substitu- 
tion rates among nucleotide sites and among lineages and on conservation of gene 
regions. However, these comparisons are fraught with statistical problems, the most 
fundamental of which is stochastic dependence among observations. It is common 
to include data from all pairwise comparisons among the sequences that are available 
to a study, ignoring the fact that they have shared common ancestry to varying 
degrees. The problem with this is illustrated in Fig. 4.2. If, for example, a relative
rates test were made comparing lineages A and C (AC) with an outgroup, O, one
could test for equality of substitution rates along the lineages from n2 to species A
and C. But the distance OB is not independent of (is correlated with) OC because
of the period of shared ancestry from n2 to n3; if by chance an improbably large
number of substitutions occurred along n2n3, then two relative rates tests would
appear significant owing to the inclusion of the same internode. The effect of not
considering stochastic dependence when making all pairwise comparisons is to
make the inference, whether right or wrong, appear more certain than it really is.
This problem applies to all inferences based on pairwise comparisons, be they
inferences about substitution rates, transition-transversion ratios, or other parameters.

FIGURE 4.2 Stochastic dependence resulting from shared ancestry. A, B, C, and O represent
contemporary sequences; n1, n2, and n3 represent ancestral nodes.

To avoid this problem we randomly selected a subset of species for pairwise com- 
parisons from each of the four phylogenies, but with the additional constraint that 
the evolutionary pathway represented by each pair is independent of those of other 
pairs, or nearly so. This procedure avoids to a considerable extent the problem of 
stochastic dependence but has the disadvantage that some information is lost to the 
analysis. However, it is a conservative procedure in that it will tend to err on the 
side of not finding a significant difference when, in fact, there is one, as opposed to 
inferring a difference that is not really there. This procedure also presupposes that 
the phylogeny is correct, which is of course unknown. However, this procedure 
should be reasonably safe if the phylogeny is based on a data set with strong phylo- 
genetic signal, because long internodes representing strong stochastic dependence 
should be relatively clearly resolved, and internodes that are not clearly resolved will 
tend to be short and hence contribute little to dependence among the observations. 
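
The independence constraint can be checked mechanically once a rooted tree is in hand. The sketch below is only an illustration, not the procedure actually used: it treats two pairwise comparisons as independent when the paths connecting the members of each pair share no branch. The function names are hypothetical, and the `parent` map encodes one assumed topology for the four sequences of Fig. 4.2, with A and B as sister taxa, C as their sister group, and O as the outgroup.

```python
def path_edges(parent, a, b):
    """Return the set of branches (child, parent) on the path between tips a and b."""
    def edges_to_root(node):
        edges = []
        while node in parent:
            edges.append((node, parent[node]))
            node = parent[node]
        return edges

    # Branches above the most recent common ancestor occur in both lists,
    # so the symmetric difference leaves only the path joining a and b.
    return set(edges_to_root(a)) ^ set(edges_to_root(b))


def pairs_are_independent(parent, pairs):
    """True if no branch of the tree is used by more than one pairwise path."""
    used = set()
    for a, b in pairs:
        edges = path_edges(parent, a, b)
        if edges & used:
            return False
        used |= edges
    return True


# Hypothetical topology for Fig. 4.2: ((A, B), C) with outgroup O.
parent = {"A": "n1", "B": "n1", "n1": "n2", "C": "n2", "n2": "n3", "O": "n3"}

print(pairs_are_independent(parent, [("A", "B"), ("C", "O")]))
print(pairs_are_independent(parent, [("A", "C"), ("B", "O")]))
```

Under this assumed topology the first set of pairs uses disjoint branches, whereas the second set shares the internode between n1 and n2.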

The 24 independent pairwise comparisons are tabulated in Table II (pairs 1-24)
along with corresponding estimates of some parameters that are useful in identifying
the range of the taxonomic hierarchy where cyt b has a reasonably high probability
of resolving the phylogeny. The pairs are ranked according to level of taxonomic
divergence. Comparisons 23 and 24 (Colaptes auratus × Balearica regulorum and
Empidonax minimus × Gallus domesticus) represent comparisons between data sets.
These comparisons between remotely related taxa were made to determine the
relationship of transitions to transversions at, or near, saturation. The C. auratus
sequence was used in two "independent" comparisons, intratribal (pair 10) and
infraclass (pair 23). Dependence resulting from duplicate use of this sequence should
be negligible because the crane (B. regulorum) is so distantly related. There are six
additional comparisons listed at the bottom of Table II (pairs 25-30). These are
statistically dependent to varying degrees among themselves and with the 24
independent comparisons but represent pairs of taxa for which levels of divergence have
been measured by DNA-DNA hybridization (Sibley and Ahlquist, 1990); the last
one of these also involves a comparison between data sets. These six comparisons were
used in one analysis in which we compared levels of cyt b divergence with levels of
nuclear genome divergence based on DNA-DNA hybridization.

The numbers of mismatches in the pairwise comparisons, falling in the various 
classes of substitutions (transitions, transversions, synonymous, and nonsynony- 
mous), were converted to substitutions per nucleotide pair by dividing each count 
by the number of base pairs in the aligned sequences. It is apparent with a glance at 
Table II that divergence primarily entails synonymous transitions, although the rela- 
tive rates at which various classes of substitutions accrue are functions of the levels 
of divergence. These relationships are discussed below because they are more 
readily seen in the graphical representations. 
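
The conversion from mismatch counts to substitutions per site can be sketched in a few lines of Python. This is an illustrative sketch rather than the software actually used: the function name is hypothetical, only unambiguous A, C, G, and T sites are counted, and the synonymous/nonsynonymous classification (which would additionally require a genetic code table) is omitted.

```python
PURINES = {"A", "G"}


def ts_tv_per_site(seq1, seq2):
    """Return (transitions/site, transversions/site) for two aligned sequences."""
    ts = tv = sites = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        # A mismatch within purines or within pyrimidines is a transition;
        # a purine-pyrimidine mismatch is a transversion.
        if (a in PURINES) == (b in PURINES):
            ts += 1
        else:
            tv += 1
    return ts / sites, tv / sites


ts, tv = ts_tv_per_site("ACGTACGTAC", "GCGTACGTAA")
print(ts, tv)
```

Dividing the two returned values gives the observed transition-to-transversion ratio for the pair, which is undefined when no transversions are observed.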
TABLE II  Statistics for Cytochrome b Sequences Compared between Avian Species^a

                                                                                                 Ts-to-Tv  Syn. sub./  Nonsyn. sub./
Pair                                                Taxon          bp       d  Ts/site  Tv/site     ratio        site           site  ΔT50H     R
 1. Grus antigone antigone × G. a. sharpei          Species      1134  0.0116    0.005    0.006      0.86       0.008          0.004      -  5.05
 2. Capito dayi × C. niger                          Genus         883  0.1522    0.120    0.009     13.25       0.118          0.011      -  0.02
 3. Grus antigone gillae × G. leucogeranus          Genus        1135  0.0558    0.043    0.010      4.45       0.043          0.010      -  0.03
 4. Grus vipio × G. rubicunda                       Genus        1135  0.0327    0.025    0.007      3.50       0.025          0.007      -  1.14
 5. Grus japonensis × G. canadensis pratensis       Genus        1135  0.0763    0.064    0.006     10.43       0.061          0.010      -  1.88
 6. Grus americana × G. grus                        Genus        1135  0.0317    0.023    0.008      2.89       0.026          0.004      -  0.61
 7. Grus monachus × G. nigricollis                  Genus        1135  0.0161    0.013    0.003      5.00       0.011          0.004      -  1.14
 8. Piculus rubiginosus × Colaptes rupicola         Genus        1029  0.0712    0.062    0.004     16.00       0.058          0.005      -  0.24
 9. Epimachus fastuosus × Ptiloris paradiseus       Tribe        1143  0.1136    0.089    0.012      7.29       0.086          0.016      -  0.04
10. Colaptes auratus × Dryocopus pileatus           Tribe        1015  0.1368    0.101    0.020      5.15       0.104          0.011      -  1.74
11. Veniliornis callonotus × Picoides villosus      Tribe        1027  0.1034    0.083    0.010      8.50       0.079          0.010      -  0.63
12. Anthropoides virgo × Bugeranus carunculatus     Tribe        1135  0.0597    0.050    0.006      8.14       0.046          0.010      -  0.48
13. Sphyrapicus varius × Melanerpes carolinus       Subfamily    1020  0.1593    0.102    0.037      2.74       0.111          0.018      -  0.23
14. Diphyllodes magnificus × Cyanocitta cristata    Subfamily    1143  0.1741    0.101    0.050      2.02       0.122          0.028      -  1.22
15. Ramphastos tucanus × Aulacorhynchus derbianus   Subfamily     878  0.1437    0.083    0.046      1.83       0.091          0.037    4.2  0.50
16. Campephilus haematogaster × Picumnus aurifrons  Family        966  0.1886    0.105    0.059      1.77       0.101          0.046      -     -
17. Ailuroedus melanotus × Ptilorhynchus violaceus  Family       1143  0.1706    0.088    0.062      1.42       0.118          0.033    5.0  0.00
18. Lybius bidentatus × Pogoniulus bilineatus       Family        880  0.1589    0.076    0.065      1.18       0.123          0.018      -  0.27
19. Balearica pavonina × Anthropoides paradisea     Family       1135  0.1086    0.083    0.015      5.53       0.087          0.011    3.3     -
20. Manucodia keraudrenii × Lanius ludovicianus     Superfamily  1143  0.1788    0.097    0.058      1.68       0.123          0.031    9.1  1.64
21. Vireo olivaceus × Catharus guttatus             Suborder     1141  0.1723    0.074    0.079      0.93       0.120          0.033   12.8  0.25
22. Sphyrapicus varius × Capito dayi                Order         873  0.2541    0.111    0.101      1.10       0.161          0.044   16.5     -
23. Colaptes auratus × Balearica regulorum          Infraclass   1019  0.2108    0.094    0.087      1.08       0.127          0.045   26.3     -
24. Empidonax minimus × Gallus domesticus           Subclass     1140  0.2371    0.089    0.111      0.80       0.151          0.050   28.0     -
25. Melanerpes carolinus × Picoides villosus        Subfamily    1035  0.1625    0.109    0.032      3.42       0.118          0.019    5.0     -
26. Colaptes auratus × Picoides villosus            Subfamily    1020  0.1361    0.090    0.031      2.88       0.095          0.019    5.8     -
27. Lybius bidentatus × Capito dayi                 Infraorder    887  0.2431    0.112    0.092      1.21       0.161          0.043   11.5     -
28. Ramphastos tucanus × Capito niger               Family        877  0.1775    0.107    0.047      2.29       0.119          0.034    6.1     -
29. Empidonax minimus × Cyanocitta cristata         Order        1139  0.2419    0.107    0.097      1.11       0.152          0.051   19.7     -
30. Empidonax minimus × Balearica regulorum         Superorder   1137  0.2208    0.094    0.094      1.00       0.142          0.046   21.6     -

^a The first 24 comparisons are the independent comparisons: bp, base pairs; d, Tamura-Nei distance; Ts, transitions; Tv, transversions; Syn. sub., synonymous
substitutions; Nonsyn. sub., nonsynonymous substitutions; R, dispersion index (see text for details). The G. domesticus sequence is from Desjardins and Morais (1990).

E. The Relationship between Transitions
and Transversions 


The ratio of transition to transversion substitutions is important for two reasons: 
(1) it is useful for identifying levels of divergence at which nucleotide sites are be- 
coming saturated by multiple transitions (multiple hits), and (2) it must be known 
and corrected for if phylogenetic algorithms are to make accurate estimates of the 
branch lengths. 

The ratio of transition to transversion substitutions is highest between sequences 
that are just beginning to diverge and decreases as time of divergence increases 
(Brown et al., 1982). Holmquist (1983) provided a theoretical understanding of this 
phenomenon. Although the "instantaneous" rate of transition substitutions is always
higher than that of transversions, the apparent rate at which diverging sequences
accrue differences changes through time, depending also on the probabilities
of the 12 possible character state changes (C→T, C→A, etc.) and the base
composition of the sequences. For pairwise comparisons of diverging sequences,
plots of transitions as a function of transversions have an initial phase where there is
a rapid and nearly linear increase in transitions, but the curve plateaus as a significant
number of sites sustain multiple transitions; at this point, sites have a much lower
probability of sustaining multiple transversions and so the number of transversional
differences between the sequences continues to increase while the number of
transitional differences levels off (Fig. 4.3). Eventually, the transition-to-transversion
ratio reaches an equilibrium value of 0.5 if G, A, T, and C are equally frequent (25%
each), but when base composition is biased as in the case of the cyt b gene, it will
equilibrate at some other value.
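
The equilibrium value can be illustrated under one simplifying assumption that is not stated in the text: at complete saturation, the two sequences behave like independent random draws from the same base composition. Under that assumption, the expected proportion of transitional differences is 2(πAπG + πCπT) and that of transversional differences is 2(πA + πG)(πC + πT). The Python sketch below (function name hypothetical) returns their ratio; the third-position frequencies are the Picidae averages as read from Table I.

```python
def saturated_ts_tv_ratio(freq):
    """Expected Ts/Tv ratio between two random sequences with base frequencies freq."""
    a, c, g, t = (freq[base] for base in "ACGT")
    ts = 2 * (a * g + c * t)          # A<->G and C<->T mismatches
    tv = 2 * (a + g) * (c + t)        # purine-pyrimidine mismatches
    return ts / tv


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(saturated_ts_tv_ratio(uniform))

# Third codon positions of Picidae (Table I), expressed as proportions.
picidae_pos3 = {"A": 0.2983, "T": 0.1307, "C": 0.5310, "G": 0.0400}
print(saturated_ts_tv_ratio(picidae_pos3))
```

For the uniform case the function reproduces the value of 0.5 given above; with the strongly biased third-position composition it returns a lower value, consistent with the statement that biased sequences equilibrate elsewhere.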

The empirical distribution of transitions as a function of transversions is pre- 
sented in Fig. 4.3 for the 24 independent pairwise comparisons compiled in 
Table II. We have plotted distinctive symbols representing comparisons among taxa 
at various levels in the taxonomic hierarchy so that a relationship between saturation 
of transitions and taxonomic level might be apparent. Each taxonomic level indicated
in Fig. 4.3 is the lowest level that encompasses both taxa in the comparison;
e.g., a symbol indicating tribe involves a comparison between two species that belong
to distinct genera within that tribe. Taxonomic level, of course, provides only
a crude index of divergence because it varies somewhat arbitrarily among major 
groups. For example, the cranes Balearica pavonina X Anthropoides paradisea [Fig. 4.3, 
open triangle (0.015, 0.083)] are assigned to separate subfamilies (Krajewski and 
Fetzner, 1994), but have a level of divergence and a transition-to-transversion ratio 
more characteristic of distinct genera or tribes within a single subfamily. 

The relationship between transitions and transversions is roughly linear through
the level of tribes, at which point it plateaus rather quickly. This indicates that avian
radiations retain phylogenetic information in the form of transitional substitutions
at least to the tribal level (relationships among genera within tribes). Above the
tribal level, transitions may still be informative, but multiple hits have significantly
eroded the phylogenetic signal or, stated another way, homoplasies and
autapomorphies are replacing synapomorphies. Identifying an upper limit where transitions are
so saturated that they no longer retain phylogenetic information is more complex
because it depends on the lengths of internal nodes in the phylogeny relative to
more distal branches representing descendant lineages (see below). Transversions
should continue to be informative well beyond the tribal level but, again, identifying
the upper level at which synapomorphies in the form of transversions have been
significantly eroded is a more complex problem that cannot be completely resolved
by examining the transition-to-transversion ratio.

FIGURE 4.3 Transition substitutions as a function of transversions. The points represent the first 24
independent pairwise comparisons tabulated in Table II.

Accurate and precise estimation of the transition-to-transversion ratio is also 
fraught with statistical problems (Holmquist, 1983). Investigators commonly aver- 
age the ratios for all pairwise comparisons of sequences in their data. This is in- 
correct. First, there is the fact that all pairwise observations are not independent. 
Second, and more important, distance estimates that correct for bias in the transi- 
tion-to-transversion ratio require the ratio of instantaneous rates at which transition 
and transversion substitutions occur (Kimura, 1980), which is different from the 
ratio of transitions and transversions that have accumulated over a long period of 
time. A better way is to make pairwise comparisons among recently diverged
sequences so that the problem of multiple hits is mitigated, although not eliminated.
The major problem with an estimate based on recently evolved sequences is that
the numbers of observed substitutions, especially transversions, will be small and as
a consequence the sampling variance of the estimate will be high (Holmquist,
1983).

We estimated the transition-to-transversion ratio for avian species by fitting a
least-squares regression line, forced through the origin, to the part of the
transition-transversion curve that is roughly linear. The slope of the regression line is an
estimate of the ratio. We included in the estimate 13 data points where the proportion
of transversion sites was less than 0.02. This includes the comparison of Colaptes
auratus and Dryocopus pileatus [Fig. 4.3, open square (0.02, 0.10)] and all data points
to the left. The estimated transition-to-transversion ratio is 6.25. The total numbers
of transitions and transversions over these 13 comparisons are 808 and 125,
respectively. We repeated the analysis with the comparison of Capito niger with C. dayi
excluded. This is the solid circle in the upper left corner of Fig. 4.3, which appears
to be an outlier. The ratio without the C. niger-C. dayi comparison is 5.77. The
number of substitutions separating C. niger and C. dayi is extraordinary, but more
worrisome is that its transition-to-transversion ratio lies well away from the other
data, although, as mentioned above, the sampling variance in this part of the curve
is high. We also estimated the transition-to-transversion ratio by an alternative method.
Since we had sequences for two individuals of each species of woodpecker, we
summed the transitions and transversions for all intraspecific comparisons. These
comparisons are independent and the level of divergence is low, so that multiple
hits are minimal. The total numbers of transitions and transversions are 66 and 13,
for a ratio of 5.08. As noted above, the variance of this estimate is large.
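
The regression estimate can be approximated from the rounded per-site values in Table II. The Python sketch below fits a least-squares line through the origin, for which the slope is Σxy / Σx², to the 13 comparisons with Tv/site no greater than 0.020 (pairs 1-12 and pair 19). The variable and function names are hypothetical, and because the inputs are rounded to three decimals, the slopes come out close to, but not exactly equal to, the 6.25 and 5.77 reported above.

```python
# (pair number, Tv/site, Ts/site), transcribed from the rounded values in Table II.
POINTS = [
    (1, 0.006, 0.005), (2, 0.009, 0.120), (3, 0.010, 0.043), (4, 0.007, 0.025),
    (5, 0.006, 0.064), (6, 0.008, 0.023), (7, 0.003, 0.013), (8, 0.004, 0.062),
    (9, 0.012, 0.089), (10, 0.020, 0.101), (11, 0.010, 0.083), (12, 0.006, 0.050),
    (19, 0.015, 0.083),
]


def slope_through_origin(points):
    """Least-squares slope of Ts on Tv for a line constrained to pass through (0, 0)."""
    sum_xy = sum(tv * ts for _, tv, ts in points)
    sum_xx = sum(tv * tv for _, tv, _ in points)
    return sum_xy / sum_xx


slope_all = slope_through_origin(POINTS)
# Repeat without pair 2 (Capito dayi x C. niger), the apparent outlier.
slope_no_capito = slope_through_origin([p for p in POINTS if p[0] != 2])
print(round(slope_all, 2), round(slope_no_capito, 2))
```

Note that the slope of a regression through the origin weights comparisons by their transversion values, so it need not equal the simple ratio of summed transitions to summed transversions (808/125).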


F. The Relationship between Taxonomic 
Divergence and Saturation of Transversions 


The logic of plotting transitions as a function of transversions is that the relatively
low rate of transversion substitutions provides a more or less linear scale over the
time span when transitions become saturated, thus revealing the level in the
taxonomic hierarchy where saturation occurs. A plausible approach to the remaining
problem of determining the level at which transversions begin to saturate is to plot
transversions as a function of a thermal dissociation parameter such as ΔT50H from
DNA-DNA hybridization studies. Thermal dissociation values are available for
many pairs of avian species because of the extensive work of Sibley and Ahlquist
(1990) and others (e.g., Sheldon, 1987; Bledsoe, 1987; Krajewski, 1989; Sheldon
et al., 1992; Bleiweiss, 1994). ΔT50H and related thermal dissociation parameters
measure the amount of sequence divergence in single-copy nuclear DNA, which
accrues at a much slower rate than in the mitochondrial genome; thus, ΔT50H could
provide a relatively linear reference scale for identifying saturation of transversions
in mitochondrially encoded cyt b. We found eight pairs of species among our set of
independent comparisons of cyt b sequences (ΔT50H values listed in Table II) plus
six additional pairs (Table II, pairs 25-30) among the complete sets of cyt b
sequences for which ΔT50H values could be determined from Sibley and Ahlquist
(1990) for the same species or for closely related species.

FIGURE 4.4 Transversion substitutions and amino acid substitutions in mitochondrial cyt b as
functions of divergence in single-copy nuclear DNA as measured by ΔT50H from DNA-DNA hybridization
studies (Delta, horizontal axis). Solid symbols represent transversions; open symbols represent amino
acids. Taxonomic levels: circles, subfamilies-families; squares, superfamilies-orders; triangles,
superorders-subclasses. The vertical axis is in units of transversion substitutions/nucleotide site or amino acid
substitutions/codon.

Transversion substitutions in cyt b as a function of ΔT50H are plotted in Fig. 4.4
for the pairwise comparisons. It is evident that the number of transversion
substitutions between diverging sequences is fairly linear up to approximately ΔT50H =
12.5-15. Again, we distinguished taxonomic levels with symbols. This indicates
that the relationship is linear through most of the comparisons in the superfamily— 
order grouping; in fact, the first point to fall markedly below the line is the right- 
most square, which compares a suboscine (Empidonax minimus) with an oscine 
(Cyanocitta cristata). This analysis is useful for identifying levels in the taxonomic 
hierarchy through which the phylogenetic information contained by transversions 
has not been severely eroded, but it does not identify the upper limit where so many 
multiple transversions have occurred that the phylogenetic signal is, for practical 
purposes, lost. 
G. The Relationship between Taxonomic
Divergence and Amino Acid Substitutions 


Amino acid substitutions as a function of ΔT50H are also plotted in Fig. 4.4 for the
pairwise comparisons. Amino acid substitutions appear to accumulate as rapidly as 
transversions for a short initial period, but then the rate slows and the trajectory 
remains low over much of the evolution of birds. 


H. Short Internodes, Long Terminal 
Branches, and Phylogenetic Resolution 


Transitions viewed as a function of transversions, and transversions, in turn, viewed
as a function of ΔT50H indicate that cyt b retains phylogenetic signal up to at least
the level of avian families and superfamilies within orders, but this does not tell
the complete story. Whether the branching order of three OTUs can actually be
resolved depends on the relative lengths of the internode (T2) and the terminal
branches (T1; see Fig. 4.5a). When the internode is long and the terminal branches
short, the probability of resolving the correct branching order is high because a
relatively large number of synapomorphies are expected to have accrued during
common ancestry and relatively little time has elapsed for subsequent substitutions
to obliterate these synapomorphies.







FIGURE 4.5 Resolution of phylogenetic trichotomies as a function of internode length (T2) and
terminal branch length (T1). (a) A hypothetical trichotomy showing the relationships between taxa, T1, and
T2. (b) Plot of bootstrap proportions supporting a trichotomy as a function of T1 and T2. The relative
level of resolution is measured by the percentage of 500 bootstrap replicates that supported the
trichotomy, based on the neighbor-joining tree (Tamura-Nei distance, including both transitions and
transversions). Distinctly shaded circles represent nodes at various levels in the taxonomic hierarchy: solid
circles, species-genera; cross-hatched circles, tribes-subfamilies; empty circles, families-suborders
(where taxonomic level is the lowest level that encompasses all taxa in the trichotomy).




The importance of the relationship between the lengths of terminal branches
and internodes was developed in a theoretical context by Lanyon (1988).
Figure 4.5b is a three-dimensional plot similar to those depicted by Lanyon, in which
he plotted the probabilities of resolving a trichotomy as a function of T1 and T2. In
our application we have plotted the percentages of 500 bootstrap replicates, as a
function of T1 and T2, that support each internode in each of the four phylogenies
(Fig. 4.1a-d). For each internode, T2 is the branch length indicated in the
phylogeny, and T1 is the average length of branches distal to the internode, averaged by
clades, moving inward from the branch tips. As an example, in the woodpecker
phylogeny (Fig. 4.14) there is a group of four species that includes Dryocopus pileatus,
two species of Colaptes, and one species of Piculus: the internode (T2) has length
0.017 and was supported by 98% of bootstrap replicates; T1 = [(0.037 + 0.033)/2
+ 0.003 + 0.041]/2 = 0.0395. This data point can be located from the coordinates
in Fig. 4.5b. It is among the cluster of five cross-hatched (tribe-subfamily) data
points near the top center of the space with bootstrap support near 100%;
specifically, it is the leftmost point in this cluster.
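
The averaging rule for T1 described above can be sketched in a few lines of Python. This is only an illustration: the nested grouping of branches is an assumption inferred from the arithmetic of the worked example, not a transcription of the published woodpecker tree, and the function names are hypothetical.

```python
from statistics import mean


def depth(node):
    """Branch length of this node plus the mean depth of its child clades.

    A node is a pair (branch_length, children); a tip has no children.
    """
    length, children = node
    if not children:
        return length
    return length + mean(depth(child) for child in children)


def terminal_branch_length(clades):
    """T1: mean depth of the clades that descend from the internode."""
    return mean(depth(clade) for clade in clades)


# Branch lengths from the worked example; the grouping is assumed.
clade_a = (0.003, [(0.037, []), (0.033, [])])  # two tips joined by a short branch
clade_b = (0.041, [])                          # a single long branch
t1 = terminal_branch_length([clade_a, clade_b])
print(round(t1, 4))
```

Averaging by clades, rather than over all tips at once, keeps a species-rich clade from dominating the estimate of T1.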

Figure 4.5b is helpful in understanding circumstances where cyt b will likely
resolve a given trichotomy and where it will not. First, following changes along the
T1 axis from back (0.000) to front (0.100), it can be seen that lower taxonomic levels
(solid dots) are in the back cross-sections and the higher levels toward the front, as
expected, because lower levels represent recent splits (short T1) and higher levels
represent more ancient splits (long T1). Now, following changes along the T2 axis,
the profile for bootstrap support as a function of T2 at a fixed value of T1 is generally
one where bootstrap values are low for low values of T2 but increase as T2 increases.
However, bootstrap support increases more rapidly as a function of T2 for low values
of T1 than for high values of T1; i.e., the longer the period of evolution along the
lineages after they split from the internode, the more eroded is the information in
the antecedent internode. Now, exploring the question of what taxonomic levels
one can expect to resolve, or not resolve, it is apparent that there is no set cutoff
point; at all levels (species-genera, tribes-subfamilies, families-suborders) some
nodes were resolved (supported by high bootstrap values) whereas others were not.
However, some generalizations can be drawn.

In making these generalizations, we use 70% bootstrap proportions (500
replicates) as a cutoff point in classifying nodes as resolved or unresolved, but
acknowledge that this is somewhat arbitrary in that the parameters guiding the evolution of
the avian cyt b gene differ from those in the simulation studies of Hillis and Bull
(1993). Still, this would provide a useful ranking to identify taxonomic levels where
cyt b works relatively well versus relatively poorly for resolving avian phylogenies.
Applying the 70% criterion, the frequency of resolved nodes is highest for
species-genera (9 of 13 nodes supported at 70% or more), intermediate for
tribes-subfamilies (6 of 13), and lowest for families-suborders (3 of 10). Thus, cyt b can be expected
to work well in resolving relationships among species within genera. It should
continue to work well through the levels of tribes and subfamilies, but can be expected
to resolve nodes at the family level and higher only if the common ancestors existed
for long periods.
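
As a quick check on the ranking above, the reported node counts can be converted to proportions. The counts are those given in the text; the variable names are illustrative only.

```python
# (nodes supported at >= 70% bootstrap, total nodes) per taxonomic level
counts = {
    "species-genera": (9, 13),
    "tribes-subfamilies": (6, 13),
    "families-suborders": (3, 10),
}

fractions = {level: resolved / total for level, (resolved, total) in counts.items()}
for level, fraction in fractions.items():
    print(f"{level}: {fraction:.0%}")

# Totals across the four phylogenies
total_resolved = sum(resolved for resolved, _ in counts.values())
total_nodes = sum(total for _, total in counts.values())
print(total_resolved, "of", total_nodes)
```

The totals (18 resolved nodes of 36) agree with the node counts used in the pilot-study comparison in Section J.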


I. Rate Variation within Lineages 


A detailed analysis of rate variation within and between the four radiations is
beyond the scope of this chapter, because of the statistical complexities and
uncertainties that need to be discussed. Nonetheless, it is important to know whether
substitution rates within radiations vary substantially. A simple test of one prediction of
the molecular clock is as follows. Neutral mutations, including silent substitutions,
should occur along a lineage according to a Poisson process, and should accumulate
in a stochastic, clocklike manner (Kimura, 1981, 1983; Gillespie and Langley, 1979;
Gillespie, 1986, 1991). Thus, the accumulation of base substitutions along lineages
that diverged from a common ancestor should have a Poisson distribution with a
mean m(t) that depends on the time, t, since divergence and a variance between
lineages, s²(t), equal to the mean (Gillespie, 1991). Thus, R(t) = s²(t)/m(t), termed
the index of dispersion, should be close to one if substitutions have occurred
according to a molecular clock along lineages diverging as a "starburst" from a
common ancestor. Estimation of R(t), like all assessments of evolutionary rates, is
fraught with statistical problems and uncertainty regarding underlying assumptions
of the model. The problem of statistical independence of lineages crops up again
when multiple comparisons are made, and two additional phenomena, collectively
termed lineage effects by Gillespie (1989), tend to inflate estimates of R(t). The two
phenomena are variations in generation time and estimations based on an incorrect
phylogeny.
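
The expectation that R(t) is close to one under a Poisson clock, and is inflated when rates differ among lineages, can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the analysis reported here: the expected number of substitutions, the number of lineages, the seed, and the uniform spread of rates are all hypothetical values chosen for illustration.

```python
import math
import random
from statistics import mean, variance


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def index_of_dispersion(counts):
    """R = variance among lineages / mean number of substitutions."""
    return variance(counts) / mean(counts)


rng = random.Random(1)   # fixed seed so the run is repeatable
lineages = 2000          # hypothetical size of the "starburst"
m_t = 20.0               # hypothetical expected substitutions per lineage

# Strict clock: every lineage shares the same expected number of substitutions.
clock_counts = [poisson_draw(rng, m_t) for _ in range(lineages)]
r_clock = index_of_dispersion(clock_counts)

# Lineage effects: the expected number varies among lineages (assumed uniform 10-30).
varied_counts = [poisson_draw(rng, rng.uniform(10.0, 30.0)) for _ in range(lineages)]
r_varied = index_of_dispersion(varied_counts)

print(round(r_clock, 2), round(r_varied, 2))
```

Under the strict clock the estimate should fall near one, whereas the rate-heterogeneous case should give a value well above one, which is the pattern the index of dispersion is meant to detect.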

Our pairwise comparisons are based on two lineages emerging from each of
several ancestral nodes and are independent of each other (e.g., the lineages
diverging from a common ancestor leading to Piculus rubiginosus and Colaptes rupicola are
independent of the pair of lineages leading to Colaptes auratus and Dryocopus
pileatus). It is possible to estimate the number of substitutions along each branch and
then the mean and variance for each pair. If rate variation is a pervasive property of
cyt b evolution, R for pairwise comparisons should tend to be greater than 1. If p is
the rate of nucleotide substitutions accumulated over a DNA sequence of length k
along a lineage from the time it splits from a common ancestor to the present (i.e.,
the branch length), then the expected number of substitutions is E(x) = E(kp) =
kE(p), and the variance is Var(x) = Var(kp) = k²Var(p). (Note that x is corrected
for multiple hits because it is based on Tamura-Nei branch distances.) An estimate
of the dispersion index then is


$$R = \frac{k S^{2}}{\bar{p}}$$




where $\bar{p}$ is the average of the two branch lengths and $S^{2}$ is the variance of the two
branch lengths. (Note that $S^{2}$ is an estimate of the variance implicit in the
evolutionary model and not the sample variance.)
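
A minimal sketch of this estimator for a single pair of lineages follows. The branch lengths and sequence length in the example are hypothetical, and the use of the n - 1 variance estimator is an assumption made here; the text does not specify the denominator used for the tabulated values.

```python
from statistics import mean, variance


def dispersion_index(branch_a, branch_b, seq_length):
    """Estimate R = k * S^2 / p_bar for one independent pair of lineages.

    branch_a, branch_b: branch lengths in substitutions per site
    seq_length: k, the number of sites compared
    """
    p_bar = mean([branch_a, branch_b])
    s_squared = variance([branch_a, branch_b])  # n - 1 denominator (assumed)
    return seq_length * s_squared / p_bar


# Hypothetical pair: branch lengths of 0.040 and 0.050 over 1000 sites.
r = dispersion_index(0.040, 0.050, 1000)
print(round(r, 2))
```

Multiplying by k converts per-site branch lengths back to counts of substitutions, so that the ratio compares a variance and a mean on the same scale.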

R values are tabulated in Table II from all of the independent pairwise
comparisons (where p for each lineage is the sum of the branch lengths from the ancestral
node to the branch tip, based on the Tamura-Nei distance and the phylogenies in
Fig. 4.1). The average R value is 0.90. Although this analysis of rate variation is
limited in scope, it is apparent that substitution rates for cyt b have not varied
substantially within the radiations studied here. This is not surprising, given that most
of the substitutions contributing to branch lengths are synonymous; indeed, it
would be surprising if rate variation were substantial.


J. How Much Sequence in a Pilot Study 


It would be helpful to be able to determine from a limited amount of sequence data 
whether cyt b will likely resolve relationships in a more expansive phylogenetic 
study. To determine this, we reduced the sequences for each of the four data sets to 
282 bp that correspond to the fragment typically amplified by primers L14841 and 
H15149 (Kocher et al., 1989), and then we repeated the neighbor-joining analysis. 
Twenty-one of 36 nodes across the four phylogenies (Fig. 4.1), which appeared in
the complete data phylogenies, also appeared in the 282-bp trees, and 15 of 18
nodes in the complete data phylogenies that were supported by bootstrap
proportions of 70% or more appeared in the 282-bp phylogeny. Thus, in a pilot study
comprising sequences derived by amplification with two of the universal primers
from Kocher et al. (1989), one can clearly determine whether cyt b is an appropriate
choice.


IV. DISCUSSION 


A. The Window of Taxonomic Resolution 
for Avian Phylogenies 


In what seems to be a commentary on the sociology of molecular systematics as 
much as a criticism of phylogenetic studies based on cyt b, Meyer (1994, p. 280) 
noted: "For laboratories that set out to undertake molecular phylogenetic work and 
need to produce preliminary data for grant proposals, the cytochrome b gene is 
often the first choice. However, the appeal of cytochrome b as an easy ‘beginner’s gene’ 
for phylogenetic work is tarnished by several of its particular shortcomings. One should not 
expect that this gene is going to be the right gene for all questions.” More recently, 
in a detailed analysis of patterns of nucleotide substitutions in cyt b in cranes,
Krajewski and King (1995) quoted the italicized part of Meyer's conclusion as the
lead-in to their counterpoint ". . . that the full utility of cytochrome b has yet to be
explored." Our study, we believe, shows that there is truth in both these
conclusions. The mitochondrially encoded cyt b gene is good for avian systematics, but its
utility is greatest in resolving the diversification of birds from the level of species (or 
subspecies) to subfamilies and families in some instances. At higher taxonomic levels 
cyt b might provide some resolution, but sequencing another gene would be a better 
investment. The "particular shortcomings" of cyt b manifest only when trying to 
resolve deeper splits in evolutionary history and, indeed, all of the disappointing 
phylogenetic studies based on cyt b involve attempts to resolve much more ancient 
groups (e.g., Meyer and Wilson, 1990; Kornegay et al., 1993; Avise et al., 1994a,b; 
Graybeal, 1993; Honeycutt et al., 1995). 

The aspects of cyt b divergence that impinge on the phylogenetic utility and 
limitations of the molecule are apparent in Tables I and II and Figs. 4.3 and 4.4 and
have been discussed particularly by Irwin et al. (1991) and Krajewski and King 
(1995), and reviewed by Meyer (1994). In birds, diverging cyt b sequences accrue 
transition substitutions at a rapid, and more or less constant, rate to the level of 
distinct genera within tribes, and transversions continue to accrue in a similar man- 
ner to the approximate level of superfamilies. These substitutions are primarily syn- 
onymous as evidenced by comparison of the synonymous and nonsynonymous 
substitution columns in Table II. It is also probable that the small burst of amino
acid substitutions in the early phase of divergence (Fig. 4.4) represents neutral or 
weakly selected substitutions. Aside from these neutral positions, which become 
saturated before the base of the avian tree is resolved, the remaining positions appear 
to be so constrained by selection that it is improbable that they would have become 
substituted, and hence synapomorphies, during the limited periods of shared ances- 
try. But, on the positive side, this period of neutral evolution spans a large and 
interesting portion of the systematic hierarchy of birds; a larger span than in other 
vertebrates because of the phenomenon of downshifting mentioned in the intro- 
duction. Stated another way, a larger fraction of avian diversity has evolved in the 
recent past, relative to that of other vertebrate groups. 

What is the most recent level of diversification for which cyt b might be expected 
to provide resolution? The answer is probably at the level of subspecies, but at this 
level the problems of resolving the gene tree and congruence between the gene tree 
and species tree confound each other (Neigel and Avise, 1986; Pamilo and Nei, 
1988; Smouse et al., 1991; Moore, 1995). With regard to resolving the haplotype 
tree (gene tree), synonymous substitutions in the 13 mitochondrially encoded pro- 
tein genes are among the fastest evolving sequences known. The mitochondrial 
control region has been suggested as a source of variation for studying recent evo- 
lutionary events, but to our knowledge it has not been demonstrated that the con- 
trol region evolves more rapidly than the synonymous positions of the protein- 
coding genes. In any event, it is clear that mitochondrial haplotype trees can be 
resolved down to the level of individual lineages. 
B. Molecular Clock


How deep in evolutionary time can cyt b provide useful information about avian 
phylogenies? Shields and Wilson (1987) calibrated the mtDNA clock for geese 
against the fossil record at 2.0% average divergence per million years (MY). Tarr and
Fleischer (1993) reasoned to the same calibration based on divergences in a complex 
of Hawaiian honeycreepers and the geological history of the islands. Krajewski and 
King (1995) estimated a slower calibration, based on the fossil record, for cranes, 
0.7-1.7%/MY. Cranes have a longer generation time (average, 4-5 years to first
breeding) than geese (average, 3 years), small passerines (1 year), and woodpeckers 
(1 year) (Table 16 in Sibley and Ahlquist, 1990). Krajewski and King (1995) noted 
that the apparent rate reduction in cranes could be related to longer generation 
time. In any case, the calibrations would be useful only over the span of time when 
the relationship between divergence and time is linear. 

To estimate this, we plotted the Tamura-Nei distance for the independent
pairwise comparisons against ΔT50H and determined empirically that the relationship
is linear up to ΔT50H = 6. We then determined the slope of a regression line forced
through the origin (plot not shown) for values of ΔT50H < 6. The slope is 3.0%
mtDNA sequence divergence/ΔT50H unit; so, when the relationship becomes
nonlinear (ΔT50H = 6), cyt b sequence divergence between pairs is approximately 18%,
or 9%/lineage. For small passerines and woodpeckers this corresponds to
approximately 9 million years before present (MYBP) (at 2%/MY divergence between lineages, or
1%/MY along a single lineage) and 10.6-25.7 MYBP for the more slowly evolving
cranes.
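
The arithmetic behind these time depths can be laid out explicitly. The numbers are those given in the text; the function and variable names are illustrative only.

```python
slope = 3.0         # % sequence divergence per ΔT50H unit (regression through origin)
linear_limit = 6.0  # ΔT50H at which the relationship becomes nonlinear

pairwise_divergence = slope * linear_limit  # % divergence between a pair of lineages
per_lineage = pairwise_divergence / 2       # % divergence along a single lineage


def time_depth_my(divergence_pct, rate_pct_per_my):
    """Millions of years needed to accumulate a pairwise divergence at a given rate."""
    return divergence_pct / rate_pct_per_my


# 2%/MY pairwise calibration (geese, honeycreepers)
passerine_limit = time_depth_my(pairwise_divergence, 2.0)

# 0.7-1.7%/MY pairwise calibration (cranes); the faster rate gives the younger bound
crane_limits = (
    time_depth_my(pairwise_divergence, 1.7),
    time_depth_my(pairwise_divergence, 0.7),
)

print(pairwise_divergence, per_lineage, passerine_limit)
print(tuple(round(limit, 1) for limit in crane_limits))
```

The calculation uses pairwise divergence with pairwise rates throughout; dividing both by two (per-lineage divergence and per-lineage rate) gives the same time depths.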


C. Conclusions and Advice 


The mitochondrially encoded cyt b gene should be given consideration, beyond 
providing preliminary data for a grant proposal, in avian phylogenetic studies fo- 
cused on taxa at the levels of approximately subfamilies or families and lower, i.e., 
down to species and subspecies. Indeed, at these levels it is arguably one of the best 
choices one can make, for several reasons. The first reason is that it is a
mitochondrial gene, and the mitochondrial haplotype tree has a substantially better
chance of congruence with the actual species tree than a nuclear gene tree,
particularly when internodes are short (Moore, 1995). For this reason and a number of
others, it is prudent to begin a phylogenetic study with a mitochondrial gene se- 
quence. Among the mitochondrially encoded genes, cyt b may or may not be the 
best choice for resolving the haplotype tree at intermediate taxonomic levels, say, 
families and subfamilies, but it should be as good as any gene for the lower levels 
because most of the variation is synonymous, and cyt b should accrue this type of 
variation as rapidly as any other gene. Protein-coding genes that are less conserved 
with regard to nonsynonymous substitutions may be more useful at intermediate
levels because amino acid substitutions would be expected to contribute more sub- 
stantially to the pool of phylogenetic information, but this has yet to be determined. 
The most promising candidates would include the ND and ATPase genes (Brown, 
1983; Desjardins and Morais, 1990). In this same vein, if synonymous substitutions 
provide most of the phylogenetic information for a particular set of taxa, a mito- 
chondrially encoded protein gene is attractive because there are 12 other protein- 
coding genes in the same linkage group accumulating similar variation at roughly 
the same rate. This provides a large, homogeneous pool of informative characters 
for resolving short internodes. A final reason for using cyt b is that it is the best 
known of the avian mitochondrially encoded genes, and an abundance of primers 
has been published. 

With regard to the possibility of combining data from two or more mitochon- 
drially encoded protein-coding genes, DeFilippis (1995) sequenced 1512 bp of the 
COI gene (1548 bp in total length) for the same woodpecker specimens reported 
here except Colaptes rupicola. DeFilippis did phylogenetic analyses on the cyt b and 
COI data sets separately and on the combined data sets. The neighbor-joining trees 
had identical topologies for the three analyses except for the placement of Campe- 
philus haematogaster, which differed in all three analyses, but was weakly supported 
by bootstrap proportions in all three. Campephilus haematogaster appears to have 
shared common ancestry with the rest of the woodpeckers for a short time early in 
the radiation and may be a case in which the precise branching order cannot be 
resolved. The average increase in bootstrap proportions for shared nodes was 7.8% 
in the combined data set tree over the individual data set trees. This is not a dramatic 
increase, but the bootstrap support was strong in both of the individual trees. 

We agree with Graybeal (1993, 1994) and Meyer (1994) in strongly encouraging
pilot studies before undertaking an exhaustive molecular phylogenetic analysis. If
the encompassing avian taxon is a family, even a superfamily, cyt b is worth further
consideration. Another easily obtained, and good, indicator is Sibley and Ahlquist's
(1991) "tapestry": if the taxon that encompasses the species of interest emanates in
a bifurcation with a ΔT50H > 12.5, consider another gene. If the study group passes
these tests, cyt b may be the best bet, but not a certain bet. Next, a pilot study should
be done using the 282-bp fragment amplified with primers L14841 and H15149
for 5-10 carefully chosen species that likely represent the range of variation for the
group. Data from the pilot study should be analyzed with regard to
transition-to-transversion ratios, bootstrap support, and, importantly, the ratios of internode
lengths (T2) to branch-tip lengths (T1). The plot of bootstrap proportions as a
function of T1 and T2 indicates, roughly, that a trichotomy will be supported by 70%+
bootstrap proportions if the ratio of T1 to T2 is no more than 4:1, and the lower the
ratio the better. Finally, because most of the phylogenetically informative
variation in cyt b is drawn from a homogeneous pool of synonymous substitutions,
and because the signal is readily detected in a subset of the
available variation, it is unnecessary to sequence the entire gene, down to the last
nucleotide. Rather than verify every nucleotide for a single specimen by sequencing
both DNA strands, it is better to sequence one strand for two specimens, thus
keeping the cost the same but greatly reducing the chances of reporting a
contaminant sequence.
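
The rough 4:1 guideline for branch-length ratios given in the pilot-study advice above can be expressed as a simple screening check. This is only a sketch of that rule of thumb: the function name and the default threshold parameter are illustrative, and the check is no substitute for inspecting bootstrap support directly.

```python
def likely_resolved(t1, t2, max_ratio=4.0):
    """Screen a trichotomy by the ratio of terminal (T1) to internode (T2) length.

    Returns True when T1:T2 is no more than max_ratio, the rough condition
    under which 70%+ bootstrap support is expected.
    """
    if t2 <= 0:
        return False  # no internode length, nothing to resolve
    return t1 / t2 <= max_ratio


# Woodpecker example from the text: T1 = 0.0395, T2 = 0.017 (98% bootstrap support)
example = likely_resolved(0.0395, 0.017)

# Hypothetical short internode beneath long terminal branches
long_tips = likely_resolved(0.080, 0.010)

print(example, long_tips)
```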
APPENDICES


APPENDIX I Woodpecker and Piculet Specimens*

Species                     Common name                  Locale                 Voucher number

Campephilus haematogaster   Crimson-bellied woodpecker   Esmeraldas, Ecuador    LSU 11786
                                                         Darien, Panama         LSU 2188
Colaptes auratus            Northern flicker             Kentucky               WSU 8618
                                                         Arizona                WSU 86101
Colaptes rupicola           Andean flicker               Pasco, Peru            LSU 8204
                                                         Puno, Peru             LSU 3901
Dryocopus pileatus          Pileated woodpecker          Kentucky               WSU 8615
                                                         Texas                  WSU 8634
Melanerpes carolinus        Red-bellied woodpecker       Kentucky               WSU 8614
Picoides villosus           Hairy woodpecker             Arizona                WSU 86107
                                                         California             WSU 86144
Piculus rubiginosus         Golden-olive woodpecker      Lambayeque, Peru       LSU 5162
                                                         Lambayeque, Peru       LSU 5222
Picumnus aurifrons          Bar-breasted piculet         Bolivia                LSU 18254
                                                         Bolivia                LSU 18479
Sphyrapicus varius          Yellow-bellied sapsucker     California             WSU 86148
                                                         California             WSU 86149
Veniliornis callonotus      Scarlet-backed woodpecker    Lambayeque, Peru       LSU 5175
                                                         Lambayeque, Peru       LSU 5178
Veniliornis nigriceps       Bar-bellied woodpecker       La Paz, Bolivia        LSU 1305
                                                         Pasco, Peru            LSU 8176

*LSU, Louisiana State University, Museum of Natural Science; WSU, Wayne State University,
Department of Biological Sciences.
APPENDIX II Mitochondrial Cytochrome b Sequences Aligned for 21 Specimens of
Picidae, Representing 11 Species^


Co. auratus 8618* 

Co. auratus 86101 

Ca. haematogaster 11786* 
Ca. haematogaster 2188 
D. pileatus 8634* 

D. pileatus 8615 

M. carolinus 8614* 
Picum. aurifrons 18254* 
Picum. aurifrons 18479 
Picul. rubiginosus 5162* 
Picul. rubiginosus 5222 
Pico. villosus 86144* 
Pico. villosus 86107 

S. varius 86148* 

S. varius 86149

V. callonotus 5178* 

V. callonotus 5175 

V. nigriceps 8176* 

V. nigriceps 1305 

Co. rupicola 8204*

Co. rupicola 3901 


Co. auratus 8618* 

Co. auratus 86101 

Ca. haematogaster 11786* 
Ca. haematogaster 2188 
D. pileatus 8634* 

D. pileatus 8615 

M. carolinus 8614* 
Picum. aurifrons 18254* 
Picum. aurifrons 18479 
Picul. rubiginosus 5162* 
Picul. rubiginosus 5222 
Pico. villosus 86144* 
Pico. villosus 86107 

S. varius 86148* 

S. varius 86149 

V. callonotus 5178* 

V. callonotus 5175 

V. nigriceps 8176* 

V. nigriceps 1305 

Co. rupicola 8204* 

Co. rupicola 3901 


123 
??? 
AAT 
??? 
AA? 
AAT 
??? 
2?С 
2?? 
226 
ААА 
77 
АА? 
АА? 
??5 
эсс 
ААС 
AAC 
AAA 
AA? 
?AT 
?AT 


111 
111 
234 
ACC 
ACC 
ACA 
ACA 
ACC 
ACC 
ACA 
ACA 
ACA 
ACC 
ACC 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACC 
ACC 


456 
?TC 
TTC 
?TC 
TTC 
TIT 
222 
TTC 
TT? 
TTT 
TTC 
TTC 
TTT 
TTT 
TTC 
TTC 
TTT 
ТТТ 
ТТТ 
TIT 
TTC 
ТТС 


111 
111 
567 
TG? 
TGC 
TGC 
TGC 
TGC 
TG? 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 


789 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGC 
GG? 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 


111 
112 
890 
CGA 
CGA 
CG? 
ccc 
CGA 
CGA 
CGC 
CGA 
CGA 
CGA 
CGA 
CGC 
сос 
са? 
CGG 
CGC 
CGC 
CGC 
CGC 
CGA 
CGA 


111 
012 
TCC 
TCC 
TCT 
TCT 
тсс 
тсс 
тст 
тес 
TCT 
тсс 
тсс 
тсс 
TCC 
TCC 
тсс 
тсс 
TCC 
TCT 
TCT 
TCC 
TCC 


111 
345 
CTC 
стс 
CTC 
стс 
стс 
CTC 
стс 
CTT 
с?? 
CTC 
CTC 
CTC 
CTC 
CTT 
CIT 
стс 
стс 
CTT 
CTC 
CIC 
стс 


111 
222 
456 
GTC 
GTC 


? GTC 


GTC 
GTT 
GTT 
GTC 


? GTC 
? GTC 


GTT 
GTT 
GTT 
GTT 
GTC 
GTC 
GTC 
GTC 
GTC 
GTT 
GTC 
GTT 


111 
678 
CTC 
стс 
CIC 
стс 
CTC 
CTC 
CTA 
CTA 
CT? 
стс 
CTC 
TTA 
TTA 


122 
901 
GGT 
GGC 
GG? 
GGC 
GGC 
GG? 
GGC 
GGC 
GGG 
GGC 
GGC 
GGC 
GGC 
GGA 
GGA 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 


111 
333 
012 
ТАС 
ТАС 
ТАС 
ТАС 
ТАС 
ТАС 
САТ 


? ТАС 


Т2? 
ТАС 
ТАС 
ТАТ 
ТАТ 
ТАС 
ТАС 
ТАС 
ТАС 
ТАС 
ТАС 
ТАС 
ТАС 


222 
234 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATT 
ATT 
ATC 
ATC 


11 
333 
345 
GG? 
GGC 
GGT 
GG? 
GGC 
GG? 
GGC 
GGT 
GGT 
GGG 
GGG 
GGC 
GGC 
GGA 
GGA 
GGC 
GGC 
GGC 
GGC 
GG? 
GG? 


222 
567 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
2G? 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 
TGC 


111 
333 
678 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TG? 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 


223 
890 
CTA 
CTA 
CTA 
CTA 
CTG 
CTG 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
TTG 
TTG 
CTA 
CTA 
CTA 
CTA 
TTA 
TT? 


111 
344 
901 
?TA 
CTA 
?ТА 
СТА 
СТА 
СТА 
CET 
?TT 
CIT 
CTA 
CTA 
CTA 
CTA 
стс 
стс 
СТА 
СТА 
СТА 
СТА 
СТА 
СТА 


333 
123 
СТА 
СТА 
АТА 
АТА 
АТА 
АТА 
GIT 
ATA 
ATA 
CTA 
CTA 
GTA 
GTA 
TTA 
TTA 
GTA 
GTA 
ATA 
ATA 
CTA 
CTA 


111 
444 
234 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 


333 
456 
ACT 
ACT 
ACA 
ACA 
GTA 
G?? 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
GCA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 


11 
4% 
567 
CGA 
со? 
со? 
сас 
CGA 
CGA 
CGC 
са? 
сос 
CGA 
CGA 
сос 
сас 
CGA 
CGA 
сос 
CGC 
сас 
CGT 
CGA 
CGA 


333 
789 


890 


AAC 
AAT 
AAT 
AAC 
AAC 


AAC 
AA? 
AAC 
AAC 
AAC 
AAC 
AAC 
AAC 
AAT 
AAT 


AAC 
AAC 


444 
012 
?TC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
?TC 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATT 
ATT 
ATT 
ATT 
ATC 
ATC 


111 
555 
123 
стс 
стс 
стс 
стс 
CTT 
CTT 
CTT 
CTA 
C?A 
стс 
стс 
стс 
стс 
стс 
стс 
crc 
стс 
стс 
стс 
стт 
стс 


444 
345 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATT 
ATC 
A?C 
ATT 
ATT 
GTC 
GTC 
ATT 
ATT 
GTC 
GTC 
GTC 
GTC 
ATC 
ATC 


111 
555 
456 
CAC 
CAC 
CAC 
CAC 
CAT 
CAT 
CAC 
CAC 
CAC 
CAC 
CAC 
CAT 
CAT 
CAC 
CAC 
CAT 
CAT 
CAT 
CAT 
CAC 
CAC 


444 
678 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 


111 
555 
789 
G?C 
GCC 
G?T 
GCT 
GCC 
GCC 
GCT 
GCC 
G?C 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 


455 
901 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGT 
GGC 
GG? 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 


555 
234 
стс 
стс 
CTT 
стт 
стс 
стс 
СТА 
СТА 
СТА 
CIT 
стт 
CTC 
стс 
СТА 
CTT 
стс 
стс 
стс 
CTC 
стс 
стс 


111 
666 
345 
GGT 
GGT 
GGA 
GGA 
GGA 
GGA 
GGA 
GG? 
GGC 
GGC 
GG? 
GGG 
GGG 
GGA 
GGA 
GGG 
GGG 
GGG 
GGG 
GGC 
GG? 


555 
567 
CTA 
CTA 
CTA 
CTA 
стс 
стс 
СТА 
стт 
CTT 
TTA 
TTA 
CTA 
CTA 
CTA 
CTA 
CTG 
ста 
CTG 
CTG 
TTA 
TTA 


111 
666 
678 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
сс 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 


556 
890 
стт 
CTT 
TTG 
TTG 
CTG 
ста 
CTC 
ATG 
ATG 
стс 
стс 
CTT 
CTT 
CIT 
стт 
CIT 
CTT 
стт 
CTT 
CTT 
CTT 


111 
677 
901 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 


123 
G?C 
?CC 
GCC 
GCC 
GCC 
GCC 
GCC 
ccc 
GCC 
GCT 
GCT 
GCT 
GCC 
GCC 
GCC 
GCT 
GCT 
GCT 
GCC 
GCC 
GCC 


11 


234 
TIC 
TTC 
TTC 
TTC 
ТТТ 
ТТТ 
TTC 
TTC 
?TC 
TTC 
TIC 
TTC 
TTC 
TIC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 


666 
456 
ACA 
ACA 
ACT 
ACT 
ACT 
ACT 
ACC 
ACC 
ACC 
ACA 
ACA 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACA 
ACA 


111 


567 
ТТТ 
ТТТ 
TTC 
TTC 
TIC 
ттс 
ттс 
ттс 
TTC 
TTT 
TIT 
TTC 
TTC 
ТТТ 
ТТТ 
TTC 
TIC 
TTC 
TIT 
TIT 
ТТТ 


789 
САС 
САС 
САС 
САС 
САС 
САС 
САС 
САТ 
САТ 
САС 
САС 
САС 
САС 
САС 
САС 
САТ 
САТ 
САС 
САС 
САС 
САС 


111 


890 
TT? 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
ТТТ 
ТТТ 
TIC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
TTC 
TTC 


012 
TAC 
TAC 
TAT 
TAT 
TAC 
TAC 
TAC 
TAT 
A?T 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 


111 


123 
AT? 
ATC 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 


777 
345 
ACC 
ACC 
ACT 
ACT 
ACC 
ACC 
ACT 
ACA 
?СА 
АСС 
АСС 
АСС 
АСС 
АСТ 
АСТ 
АСТ 
АСТ 
АСТ 
АСТ 
АСС 
АСС 


111 


456 
T?T 
TGC 
TGA 
TGC 
TGC 
TG? 
TGC 
TGA 
T?A 
TGT 
TGT 
TGC 
TGC 
TGT 
TGT 
TGT 
TGT 
TGT 
TGC 
TGT 
TGT 


777 
678 
GCC 
GCC 
GCC 
GCC 
GCC 
асс 
GCT 
G?C 
GAC 
GCC 
GCC 
GCC 
GCC 
G?A 
G?A 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 


111 


789 
АТС 
АТС 
АТС 
АТС 
АТС 
АТС 
АТС 
АТС 
АТС 
АТС 
АТС 
АТТ 
АТТ 
?TC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 


788 
901 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 


GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 


GAC 
GAC 
GAC 
GAC 


111 


012 
TAC 
TAC 
TAC 
TAT 
TAC 
TAC 
TAT 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 


234 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACG 
ACG 
ACA 
ACA 
ACA 
ACA 
ACC 
ACC 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 


111 


345 
с?с 
стс 
CTT 
CTT 
CTC 
стс 
CTG 
TTC 
TAC 
стс 
стс 
ТТА 
ТТА 
СТА 
СТА 
ТТА 
ТТА 
ТТА 
ТТА 
CTC 
CTC 


567 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
AAC 
A?C 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 


11 


CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
сэс 
САС 
САС 
САС 
САС 
САС 
САС 
САС 
САС 
САС 


САС 
САС 


889 
890 
СТА 
СТА 
СТА 
СТА 
ТТА 
ТТА 
СТА 
ста 
сте 
CTG 
CTG 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTG 
Ств 


122 


901 
АТТ 
АТТ 
АТС 
АТС 
АТТ 
АТТ 
АТС 
АТС 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 


123 
GCC 
GCC 
GCC 
Gcc 
Gec 
GCC 
Gcc 
асс 
GTC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GcC 
GCC 
Gec 
GCC 
GCC 


222 


234 
GG? 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGG 
GGA 
GGA 
GGA 
GGA 
GGC 
GG? 
GGG 
GGG 
GGA 
GGA 
GGA 
GGA 


456 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
T?C 
T?C 
TIT 
ТТТ 
ТТС 
ТТТ 
TTC 
TTC 


222 


567 
CG? 
сас 
сот 
CGT 
CGA 
CGA 
саб 
CGA 
CGA 
CGT 
CGT 


CGA 
CGG 
Са? 
ccc 
сас 
сас 
CGT 
CGA 
CGA 


789 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCG 
TCG 
TCT 
TCT 
TCT 
TCT 
TCA 
TCA 


222 
001 
890 
GGA 
GGA 
GG? 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GG? 
GGG 
GGA 
GGA 
GGA 
GGA 
GGA 


111 
000 
012 
TCC 
TCC 
TCC 
TCC 
TCC 
TCC 
тсс 
TCC 
TCC 
TCC 
TCC 
TCC 
TCC 
TCT 
TCT 
TCC 
тсс 
тсс 
тсс 
тсс 
тсс 


222 


123 
ТТТ 
ТТТ 
ТТС 
TTC 
TIC 
ттс 
стс 
TIT 
TIT 
TIC 
TTC 
TIC 
TIC 
ттс 
TIC 
TTC 
TTC 
TTC 
TTC 
ттс 
TTC 


111 
000 
345 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTT 
GTC 
Gc 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 


222 
111 
456 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
T?? 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 


111 
000 
678 
Gcc 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GAC 
GCC 
GCC 
GCC 
GCC 
Gcc 
acce 
GCC 
GCC 
GCC 
Gcc 
GCC 
Gc 


222 
111 
789 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAT 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 


111 
011 
901 
CAC 
CAC 
CAC 
CAC 


CAC 
CAC 
CAC 
?AC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAT 
CAT 
CAC 
CAC 


222 
222 
012 
GGA 
GGA 
GGA 
GGA 
GGG 
GGG 
GGG 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 


(Continues) 
Co. auratus 8618*
Co. auratus 86101
Ca. haematogaster 11786*
Ca. haematogaster 2188
D. pileatus 8634*
D. pileatus 8615
M. carolinus 8614*
Picum. aurifrons 18254*
Picum. aurifrons 18479
Picul. rubiginosus 5162*
Picul. rubiginosus 5222
Pico. villosus 86144*
Pico. villosus 86107
S. varius 86148*
S. varius 86149
V. callonotus 5178*
V. callonotus 5175
V. nigriceps 8176*
V. nigriceps 1305
Co. rupicola 8204*
Co. rupicola 3901


Co. auratus 8618*
Co. auratus 86101
Ca. haematogaster 11786*
Ca. haematogaster 2188
D. pileatus 8634*
D. pileatus 8615
M. carolinus 8614*
Picum. aurifrons 18254*
Picum. aurifrons 18479
Picul. rubiginosus 5162*
Picul. rubiginosus 5222
Pico. villosus 86144*
Pico. villosus 86107
S. varius 86148*
S. varius 86149
V. callonotus 5178*
V. callonotus 5175
V. nigriceps 8176*
V. nigriceps 1305
Co. rupicola 8204*
Co. rupicola 3901


222 
222 
345 
TCC
TCC
TCC
TCC
TCC
TCC
TCA
TCT
TCT
TCC
TCC
TCC
TCC
TCC
TCC 
TCC 
TCC 
TCC 
TCC 
TCC 
TCT 


333 
333 
456 
GCA 
GCA 
GGA 
272 
GCT 
GCT 
GCT 
GCG 
GCG 
GCA 
GCA 
GCA 
G?? 
G?? 
GCA 
GCA 
GCA 
GCG 
GCA 
GCA 
GCA 


(Continued) 


222 
222 
678 
TAC 
TAC 
TAT 
TAT 
TAC 
TAC 
TAC 
TAC 
?AC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 


333 


789 
ACC 
ACC 
ACC 
22? 
ACC 
ACC 
ACT 
ACC 
ACC 
A?C 
A?C 
??C 
??C 
A?C 
ACC 
A?C 
A?C 
ACC 
ACC 
ACC 
ACC 


222 
233 
901 
CTG 
CTG 
CTA 
CTA 
CTA 
CTA 
CTA 
CTC 
C?C
CTA
CTA
CTA
CTA
CTA
CTA
CTA
CTA
TTA
TTA
CTG 
CTA 


333 
444 
012 
GTC 
GTC 
ТТТ 
GTT 
GTT 
GTT 
GTT 
GTA 
GTA 
GTT 
GTT 
GTT 
GTT 
GTC 
GTC 
GTT 
GTT 
GTT 
GTT 
GTC 
GTC 


222 
333 
234 
ТТТ 
ТТТ 
TTC 
TTC 
TTT 
TIT 
TIT 
T?C 
T?C 
ттс 
ттс 
TTC 
TTC 
ТТТ 
ТТТ 
ТТТ 
TIT 
TIT 
TIT 
TTC 
TIC 


333 


345 
ATT 
ATT 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATT 
ATT 
ATT 
ATT 
ATC 
ATC 
ATT 
ATT 
ATT 
ATT 
ATC 
ATC 


М-БЕЕРЕЕЕЕЕЕРЕЕЕРЕРЕЕЕЕ 


> > > > 0 5 
осос-ғ 
>>>>оғк 


ACT
ACT
ACA
ACA
ACA
ACA
ACA
ACA
ACA
ACA
ACA
ACA
ACA
ACA
ACA
ACA
ACA


222 
444 
123 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACG 
A?C 
ACC 
ACC 
ACC 
ACC 
ACC 
ACT 
A?T 
ACC 
ACC 
ACC 
ACT 
ACC 
ACC 


333 
555 


crc 
стс 
СТА 
СТА 
стт 
cr? 
CTA 
CTA 
CTA 
cic 
стс 
ТТА 
ТТА 
СТА 
СТА 
СТА 
СТА 
СТА 
ТТА 
стс 
стс 


222 
444 
456 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 


333 
555 
567 
TTC 
TIC 
TTC 
TIC 
TTC 
T?C 
TIT 
TTC 
TIC 
TTC 
TTC 
TIT 
TIT 
TTC 
TIC 
TIC 
TTC 
TTC 
TIC 
TTC 
TTC 


TCG 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 


222 
555 
012 
ACC 
ACC 
ACC 
ACC 
ACT 
ACT 
ACC 
ACA 
ACA 
ACC 
ACC 
ACA 
ACA 
ACT 
ACT 
ACA 
ACA 
ACA 
ACG 
ACT 
ACT 


333 


123 
G?C 
G?C 
всс 
GCC 
GCC 
G?C 
GCT 
GCA 
GCA 
GCC 
всс 
всс 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 


222 
555 
345 
GGA 
GGA 
GGA 
GGA 
GGG 
GGG 
GGA 
GG? 
GGT 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 


333 


456 
стс 
стс 
стс 
crc 
CTC 
CTC 
TTC 
TIC 
TTC 
crc 
CTC 
стс 
cic 
TTC 
TTC 
cre 
стс 
стт 
стс 
cre 
CTC 


222 
555 
678 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTT 
GIT 
GTT 
GTT 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 


333 


789 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
CCT 
ccc 
ccc 
CCT 
CCT 
сст 
CCT 
CCT 
CCT 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 


222 


901 
ATT 
ATT 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATT 
ATT 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATT 
AIT 


333 


012 
TAC 
TAC 
TA? 
TAC 
TAC 
TA? 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 


222 
666 
234 
стс 
стс 
стс 
стс 
стс 
стс 
CTT 
стс 
стс 
стс 
стс 
стс 
стс 
CTT 
cr? 
стт 
CIT 
стт 
CTT 
crc 
стс 


333 


345 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
GTA 
GTA 
ATT 
ATT 
GTA 
GTA 
GTA 
GTA 
ATT 
ATT 


222 


567 
CTC 
стс 
стс 
стс 
стт 
CTT 
CIT 
CTA 
С?А 
СТС 
стс 
CTT 
CTT 
CTT 
стт 
стс 
стс 
стс 
стс 
CTC 
CTC 


333 


678 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGA 
GGA 
GGC 
GGC 
GGA 
GGA 
GGA 
GGA 
GGC 
GGC 


222 
667 
890 
CTC 
стс 
стс 
стс 
cic 
CTC 
стс 
стс 
стс 
стс 
CTC 
CTT 
CIT 
CTA 
CTA 
стс 
стс 
стс 
стс 
стс 
crc 


222 
777 
123 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 


333 


234 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACT 
ACT 
ACT 
ACT 
ACC 
ACC 
ACT 
ACT 
ACT 
ACT 
ACC 
ACC 


222 


456 
cic 
стс 
CIT 
CTT 
стт 
CIT 
стс 
СТА 
СТА 
CTT 
CIT 
CIT 
стт 
CTT 
CTT 
стс 
стс 
стс 
cic 
cic 
стс 


333 


567 
ATT
ATT
ATT
ATT
ATC
ATC
ATT
ATC
ATC
ATC
ATC
ATC
ATC
ATT
ATT
ATC
ATC
ATC
ATC
ATT
ATC


222 


789 
ATA
ATA
ATA
ATA
ATA
ATA
ATA
ATA
?TA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 


333 


890 
GTC 
GTC 
GTC 
GTC 
GTT 
GTT 
GTC 
GTC 
GTC 
GTT 
GTT 
GTA 
GTA 
GTT 
GTT 
GTC 
GTC 
GTC 
GTC 
GTT 
GTT 


222 


012 
G?C 
G?C 
GCA 
GCA 
GCC 
G?C 
GCC 
GCC 
G?C 
GGC 
GGC 
GCC 
GCC 
GCA 
G?A 
GCC 
GCC 
GCC 
GCC 
GCC 
G?C 


222 


345 
ACT 
ACT 
ACT 
ACT 
ACT 
ACT 
ACT 
ACC 
ACC 
ACT 
ACT 
ACC 
ACC 
ACT 
ACT 
ACC 
ACC 
ACC 
ACC 
ACT 
ACT 


333 


456 
TGA 
TGA 
TGG 
TGG 
TGG 
TGG 
TGG 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 


222 


678 
GCC 
GCC 
GCC 
G?C 
GCT 
G?T 
GCC 
GCC 
G?? 
Gcc 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCT 
GCT 


333 


789 
GCC 
эсс 
GCC 
Gcc 
GCC 
G?C 
GCC 
GCT 
GCT 
GCT 
GCT 
GCC 
GCC 
Gcc 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 


222 
899 
901 
TTC 
TIC 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
Т?с 
TIC 
TIC 
TTC 
ттс 
TIC 
TTC 
TTC 
TTC 
TTC 
TTC 
ттс 
TTC 


444 
000 
012 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 


222 
999 
234 
GT? 
GTA 
G?C 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTA 
GTA 
GTC 
GTC 
G?C 
G?C 
GTC 
GTC 
GTC 
GTC 
GTA 
GTA 


444 
000 
345 
GGT 
GGT 
GG? 
GGC 
GGC 
GG? 
GGA 
GGG 
GGG 
GGT 
GGT 
GGG 
GGG 
GGG 
GGG 
GGG 
GGG 
GGA 
GGA 
GGC 
GGT 


222 
999 
567 
GGC 
GGC 
GGT 
GGT 
GGC 
GGC 
GGC 
GGT 
GGT 
GGC 
GGC 
GGC 
GGC 
GGC 
GG? 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 


444 
000 
678 
GGT 
GGT 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGG 
GGG 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 


223 
990 
890 
TAC 
TAC 
TAC 
TAC 
TAT 
TAT 
TAC 
TAC 
TAC 
TAT 
TAT 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAT 
TAT 


444 
011 
901 
TIC 
TTC 
TT? 
TT? 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
TIT 
TIT 
TTC 
TTC 
TTC 
TTC 
TIC 
TIC 


333 
000 
123 
GTT 
GTT 
GTC 
GTC 
GTC 
GTC 
GTT 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTT 
GT? 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 


444 
111 
234 
TCC 
TCC 
TC? 
TCC 
TCT 
TCT 
TCT 
TCA 
TCA 
TCC 
TCC 
TCC 
TCC 
TCC 
TCC 
TCT 
TCT 
TCT 
TCT 
TCC 
TCC 


333 
000 
456 
CTG 
CTG 
стс 
стс 
стс 
crc 
стс 
стс 
стс 
сте 
CTG 
стс 
стс 
стс 
стс 
стс 
CTC 
cr? 
стс 
CTG 
CTG 


444 
111 
567 
GTA 
GTA 
GTG 
GTG 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 


333 
000 
789 
ccc 
ccc 
CCT 
CCT 
сст 
CCT 
ccc 
ccc 
ccc 
Gec 
?CC 
ccc 
ccc 
CCA 
CCA 
CCT 
сст 
ccc 
ccc 
ccc 
ccc 


444 
112 
890 
GAC 
GAC 
GA? 
GA? 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 


GAC 
GAC 
GAC 
GAC 
GAC 


GAT 


333 
111 
012 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 


333 
111 
345 
GGA 
GGA 
GGA 
GGA 


GGG 


GGG 
GGA 
GGA 
GG? 
GGA 


GGA ? 


GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGG 
GGG 


44% 
222 
456 
ссс 
ссс 
CCT 
ССТ 
ссс 
ссс 
ССА 
ССА 
ССА 
ссс 
ссс 
ссс 
ссс 
CCT 
CCT 
ccc 
ccc 
CCT 
ccc 
ccc 
ccc 


444 
222 
789 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACA 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACA 
ACA 
ACT 
ACT 
ACT 
ACT 
ACC 
ACC 


333 
122 
901 
ATA 
ATA 
ATG 
222 
АТА 
АТА 
АТА 
ATG 
AT? 
ATA 
ATA 
ATA 
ATA 
ATA 
AT? 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 


444 
333 
012 
стс 
CTC 
стс 
стс 
стс 
стс 
стс 
стт 
CIT 
стт 
CTT 
стс 
CTC 
стс 
CTC 
стс 
стс 
стс 
CTC 
стс 
стс 


333 
222 


ТСА 
ТСА 
TCG 
2” 
ТСА 
ТСА 
ТСА 
ТСА 
Т?А 
TCG 
TCG 
TCG 
TCG 
TCC 
TCC 
TCA 
TCA 
TC? 
TCA 
TCG 
TCG 


444 
333 
345 
ACC 
AIC 
ACT 
ACT 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACT 
ACT 


333 
222 
567 
TTC 
TTC 
TTC 
22? 
TTC 
TTC 
ТІС 
TIC 
TTC 
TIC 
TIC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
??C 
TCC 
TIC 
TIC 


444 
333 
678 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 
CGA 


333 
223 
890 
TGA 
TGA 
TGA 
727 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 


444 
344 
901 
TIC 
TTC 
TTC 
TTC 
TIT 
TT? 
ТІС 
TTC 
TIC 
TTC 
TTC 
TTC 
TTC 
ТТТ 
ТТТ 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 


333 
333 
123 
GGT 
GGT 
GGG 
222 
GGC 
GGC 
GGG 
GGG 
GGG 
GGC 
GGC 
GGG 
GGG 
GG? 
GGC 
GGG 
GGG 
GGG 
GGA 
GG? 
GGT 


444 
44% 
234 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
ТТС 
TTC 
TIT 
TTT 
TTT 
TTT 
ТТС 
TTC 
TTC 
TTC 
TTC 
TTC 


(Continues) 


APPENDIX II (Continued) 

444 444 
444 445 
567 890 
Co. auratus 8618* GCT CTC 
Co. auratus 86101 G?C CTC 
Ca. haematogaster 11786* GCC TTA 
Ca. haematogaster 2188 GCC TTA 
D. pileatus 8634* GCA CTT 
D. pileatus 8615 G?A CTT 
M. carolinus 8614* GCC CTG 
Picum. aurifrons 18254* GCC CTA 
Picum. aurifrons 18479 GCC CTA 
Picul. rubiginosus 5162* GCC CTC 
Picul. rubiginosus 5222 GCC CTC 
Pico. villosus 86144* G?C CTA 
Pico. villosus 86107 G?C CTA 
S. varius 86148* G?T CTA 
S. varius 86149 G?T CTA 
V. callonotus 5178* GCC CTA 
V. callonotus 5175 G?C CTA 
V. nigriceps 8176* GCC CTA 
V. nigriceps 1305 G?C CTA 
Co. rupicola 8204* GCC CTC 
Co. rupicola 3901 G?C CTC 

555 555 
555 566 
678 901 
Co. auratus 8618* AAA ATC 
Co. auratus 86101 AAA ATC 
Ca. haematogaster 11786* AAA ATT 
Ca. haematogaster 2188 AAA ATT 
D. pileatus 8634* AAA ATC 
D. pileatus 8615 AA? ATC 
M. carolinus 8614* AAA ATC 
Picum. aurifrons 18254* AAA ATC 
Picum. aurifrons 18479 AAA ATC 
Picul. rubiginosus 5162* AAA ATC 
Picul. rubiginosus 5222 AAA ATC 
Pico. villosus 86144* AAA ATT 
Pico. villosus 86107 AAA ATT 
S. varius 86148* AAA ATC 
S. varius 86149 AAA ATC 
V. callonotus 5178* AAA ATC 
V. callonotus 5175 AAA ATC 
V. nigriceps 8176* AAA ATC 
V. nigriceps 1305 AAA ATC 
Co. rupicola 8204* AAA ATC 
Co. rupicola 3901 AAA ATC 


444 
555 
123 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 


555 


234 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
CCA 
CCA 
ссс 
ссс 
ссс 
ссс 
CCA 
CCA 
ccc 
ccc 
сст 
ссс 
ссс 
ссс 


444 
555 
456 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TIT 
TTC 
ттс 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
TIT 
TIT 
TIT 
TIT 


TTC 


TTC 
TTC 
TIC 
TIT 
T?? 
TTC 
ТТС 
TIC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TIT 
TTC 
ттс 


444 
555 
789 
стс 
стс 
стс 
стс 
CTT 
стт 
CTC 
CTC 
crc 
CIT 
CTT 
стс 
стс 
CIT 
CIT 
стс 


444 
666 
012 
CTC 
CTC 
CTC 
CTC 
CIT 
ст? 
CTT 
стс 
стс 
стс 
стс 
стс 
стс 
стс 
стс 
стс 
стс 
стс 
стс 
стс 
стс 


555 
777 
123 
ссс 
ссс 
ССА 
ССА 


444 
666 
345 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
ccc 
ccc 
CCA
CCA
CCA
CCA
CCA
CCA
CCA
CCA
CCA
CCA
CCA
CCA


555 
77 
456 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
222. 
TAC
TAC
TAC
TAC
TAC
TAC
TAC
TAC
TAC
TAC
TAC
TAC


444 


678 
TIC 
TTC 
TTC 
TIC 
TTC 
T?C 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
ттс 


555 
777 
789 
TTC 
TTC 
TIC 
TT? 
TTC 
TTC 
TTC 
227 
2% 
ТТС 
ТТС 
ттс 
ттс 
ттс 
ттс 
TTC 
TTC 
TIC 
TTC 
TTC 
ттс 


444 
677 
901 
TTA 
TTA 
ATA 
ATA 
CTA 
ATA 
ATA 
ATA 
ATA 
CTA 
?TA 
CTA 
CTA 
ATA 
ATA 
CTA 
CTA 
TTA 
TTA 
CTA 
CTA 


555 


012 
тсс 
тсс 
TCC 
TCC 
TCC 
TCC 
ACC 
22С 
22С 
ТСТ 
ТСТ 
TCC 
тсс 
тст 
TCT 
TCC 
тсс 
тсс 
тсс 
тст 
тст 


444 
777 
234 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
?TC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATT 
ATT 
ATT 
ATT 
ATC 
ATC 


555 
B88 
345 
GTA 
GTA 
ACA 
ACA 
GTA 
GTA 
GTA 
CTA 
CTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
?ТА 
GTA 
GTA 


444 
777 
567 
GCA 
G?A 
GCA 
GCA 
GCA 
GCA 
GCA 
GCC 
GCC 
GCA 
GCA 
GCA 


8 


SESSESEES SER SESEES 


AAA 
AAA 
AAA 


444 
778 
890 
GGC 
GGC 
GGC 
GGC 
GGT 
GGT 
GGA 
GGG 
GGC 
GGT 
GGT 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGG 
GGA 
GG? 
GG? 


555 


901 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAT 
GAT 
GAT 
GAT 
GAC 
GAC 


444 
888 
123 
стс 
crc 
стс 
стс 
CTT 
стс 
CT 
CTA 
CTA 
CTC 
стс 
crc 
cic 
стс 
стс 
стс 
crc 
CTC 
CTC 
CTT 
CIT 


555 


234 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 


444 
888 
456 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACT 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACT 
ACT 


555 


567 
CTA 
CTA 
CTA 
CTA 
CTG 
ста 
CTT 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
СТА 
СТА 
СТА 
СТА 
СТА 
СТА 
СТА 
CTA 


444 
888 
789 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 


556 
990 
890 
GGC 
GGC 
GGA 
GGA 
GGC 
GG? 
GGA 
GGA 
GGA 
GGC 
GGC 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGC 
сес 


444 
999 
012 
ATT 
ATT 
GTC 
GTC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATT 
ATC 
ATC 


666 
000 
123 
TIC 
TTC 
TIC 
TTC 
тіс 
TTC 
TTC 
TIG 
TTG 
TTC 
TIC 
TTC 
TTC 
TIT 
ТТТ 
TTC 
TTC 
TTC 
TTC 
TTC 
ттс 


444 
999 
345 
CAC 
CAC 
CAT 
CAT 
CAC 
CAC 
CAT 
CAC 
CAC 
CAC 
CAC 
CAT 
CAT 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAT 
CAC 


000 
456 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
GTA 
GTA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 


444 
999 
678 
TTC 
TIC 
TTC 
TTC 
ТТТ 
TT? 
TIT 
TTC 
TIC 
TIC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
TTC 
TTC 
TTC 


000 
789 
TTC 
ттс 
TTC 
TIC 
TTC 
TTC 
CTT 
TTC 
TTC 
ттс 
ттс 
TTC 
TTC 
CTT 
стт 
TTC 
TTC 
TTC 
TTC 
TTT 
TTC 


455 
900 
901 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACC 
ACT 
ACT 
ACC 
ACC 
ACT 
ACT 
ACC 
ACT 
ACC 
ACC 


666 
111 
012 
ATA 
ATA 
ATA 
ATA 
ATG 
ATG 
ATA 
GTG 
GTG 
ATA 
ATA 
ATG 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 


555 
000 
234 
TTC 
TIC 
TIT 
TIC 
TTC 
TIC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
TIC 
TTC 
TTC 


666 
111 
345 
стс 
стс 
стс 
стс 
стс 
стс 
TTC 
CIT 
ETT 
crc 
стс 
стс 
стс 
СТА 
СТА 
стс 
стс 
стс 
стс 
стс 
стс 


555 
900 
567 
crc 
crc 
стс 
стс 
ETI: 
CTC 
стс 
стс 
стс 
стс 
crc 
cic 
стс 
Cic 
стс 
crc 
cic 
crc 
стс 
стс 
стс 


111 
678 
cic 
стт 
стс 
стс 
стс 
стс 
crc 
crc 
стс 
crc 
стс 
стс 
crc 
стс 
стс 
TIC 
TTC 
cic 
CIC 
стс 
стс 


555 
001 
890 
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC
CAC


122 
901 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
CcT 
сст 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 


555 
111 
123 
GAA 
GAA 
GAA 
GAA 
GAA 
GAA 
GAA 
GA? 
GAA 
GAA 
GAA 
GAA 
GAA 
GAA 
GAA 
GAA 
GAA 
GA? 
GAA 
GAA 
GAA 


222 
234 
стс 
ст? 
стс 
стс 
CTT 
CTT 
cic 
TTA 
TTA 
cic 
cic 
crc 
стс 
CTT 
CTA 
стс 
CTC 
стс 
стс 
стс 
Cc 


555 
111 
456 
TCC 
TCC 
TCA 
TCA 
TCA 
TCA 
TCA 
TCG 
TCG 
TCT 
TCT 
TCT 
TCT 
TCA 
TCA 
TCC 
TCC 
TCT 
TCC 
TCT 
TCT 


222 
567 
ACG 
A?G 
GTA 
GTA 
GTA 
GTA 
ACA 
ACA 
ACA 
ACG 
ACG 
ACA 
ACA 
ACA 
?CA 
GCA 
GCA 
GTA 
GT? 
ACA 
ACA 


555 
111 
789 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GGC 
GG? 
GGC 
GGT 
GGT 
GGC 
GGC 
GGC 
GGC 
GGC 
GG? 
GGC 
GG? 
GG? 
GGC 


223 


G?C 
Gcc 
GCC 
Gc 
ACC 
эсс 
АСС 
АСС 
АСС 
6?С 
GGC 
GGC 
G?C 
всс 
Gc 
ACC 
ACC 
ACC 
?CC 
ACC 
ACC 


555 


012 
TCA 
TCA 
TCA 
TCA 
TCG 
TCG 
тсс 
тсс 
тсс 
ТСА 
ТСА 
ТСА 
ТСА 
тст 
тст 
TCG 
TCG 
TCG 
TCG 
TCA 
TCA 


333 
123 
CTA 
CTA 
стс 
стс 
СТА 
СТА 
СТА 
crt 
CIT 
TTA 
TTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 


456 
GCA 
G?A 
GCA 
GCA 
GCA 
GCA 
GCA 
TCC 
TCC 
GCA 
GCA 
GCA 
GCA 
GCA 
G?A 
GCA 
GCA 
GCA 
GC? 
GCA 
GCA 


789 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 


555 
233 
901 
ccc 
ccc 
сст 
ccT 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
сст 
сст 


666 
444 
012 
TTC 
ттс 
TTC 
TIC 
TTC 
TIC 
TIC 
TTC 
TTC 
TIC 
TIC 
ТТТ 
ТТТ 
ттс 
ттс 
TTC 
TTC 
TTC 
TIC 
ттс 
ттс 


555 
333 
234 
стс 
стс 
СТА 
СТА 
CIT 
стс 
СТА 
стс 
стс 
стс 
стс 
стс 
стс 
CIT 
CIT 
CTA 
CTA 
стс 
стс 
CTT 
CIT 


444 
345 
TCA 


TCA 
TCA 
TCA 
TCA 
TCC 
TTA 
TTA 


TCA 
TCA 


TCG 
TCG 
TCA 
TCA 
TCA 
TCA 
TCA 
TCA 


555 


567 
GGG 
GGG 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 
GGA 


448 
678 
ссс 
эсс 
сст 
?CT 
ccc 
ccc 
?CC 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
сст 
CCT 
ccc 


555 
334 
890 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
GTC 
ATC 


ATC 


555 
464 
1e3 
GTA 
GTA 
GTA 
GTA 
GTG 
GTG 
GTA 
A?A 
ACA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 
GTA 


555 
234 
CTC 
стс 
CTG 
CTG 
стс 
стс 
CTT 
crc 
стс 
стс 
стс 
стс 
crc 
crc 
CTC 
CTC 
crc 
стс 
crc 
crc 
стс 


555 
444 
456 
TCT 
TCT 
тес 
тсс 
тсс 
тсс 
TCT 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 


555 
567 
СТА 
СТА 
ТТА 
ТТА 
стс 
?TG 
TTA 
CTA 
CTA 
TTA 
TTA 
TTA 
TTA 
TTA 
TTA 
CTA 
CTA 
CTA 
CTA 
CTA 
TTA 


555 
444 
789 
GAC 
GAC 
GAC 
GAC 
GAT 
GAT 
GAT 
GAC 
GAC 
GAC 
GAT 
GAC 
GAC 
GAT 
GAT 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 


666 
556 
890 
GGA 
GGA 
GGT 
GGG 
GGG 
GGG 
GGT 
GGT 
GGT 
GGA 
GGA 
GGC 
GGC 
GGT 
GGT 
GGA 
GGA 
GGA 
GGA 
GGA 
GGG 


555 
555 
012 
TGT 
TGT 
TGC 
TGC 
TGT 
TGT 
TGT 
TG? 
TG? 
TGT 
TGT 
TGT 
TGT 
TGC 
TG? 
TGC 
TGC 
TGC 
AG? 
TGT 
TGT 


666 
666 
123 
GAC 
GAC 
GAC 
GA? 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 
GAC 


555 
555 
345 
GAC 
GAC 
GAC 
GAC 
GAT 
GA? 
GAC 
GA? 
GAC 


GAT 
GAC 
GAC 
GAC 
GAC 
GAT 
GAT 
GAT 
GAT 
GAC 
GAC 


666 
666 
456 
CCA 
CCA 
ccc 
22? 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
CCA 
ССА 
ССА 
с?5 
ССА 
ССА 


(Continues) 


APPENDIX II (Continued) 

666 666 666 666 
666 777 777 777 
789 012 345 678 
Co. auratus 8618* GAA AAC TTC AC? 
Co. auratus 86101 GAA AAC TTC ACG 
Ca. haematogaster 11786* GAA AAC TTC ACA 
Ca. haematogaster 2188 ??? ??? ??? ??? 
D. pileatus 8634* GAG AAC TTC ACA 
D. pileatus 8615 GAG AAC TTC ACA 
M. carolinus 8614* GAA AAC TTC ACA 
Picum. aurifrons 18254* GAA AAC TTC ACA 
Picum. aurifrons 18479 GAA AAC TTC ACA 
Picul. rubiginosus 5162* GAA AAC TTC ACA 
Picul. rubiginosus 5222 GAA AAC TTC ACA 
Pico. villosus 86144* GAA AAC TTC ACA 
Pico. villosus 86107 GAA AAC TTC ACA 
S. varius 86148* GAA AAC TTC ACA 
S. varius 86149 GAA AAC T?C ACA 
V. callonotus 5178* GAA AAC TTC ACA 
V. callonotus 5175 GAA AAC TTC ACA 
V. nigriceps 8176* GAA AAT TTC ACA 
V. nigriceps 1305 GAA AAT TT? ACG 
Co. rupicola 8204* GAA AAC TTC ACA 
Co. rupicola 3901 GAA AAC TTC ACA 

777 777 777 777 
778 888 888 888 
890 123 456 789 
Co. auratus 8618* GTC CTA GCC CTA 
Co. auratus 86101 GTC CTA GCC CTA 
Ca. haematogaster 11786* GTC CTC GCT CT? 
Ca. haematogaster 2188 GTC CTC GCT CT? 
D. pileatus 8634* GTC CTG GCC CTA 
D. pileatus 8615 GTC CTG GCC CTA 
M. carolinus 8614* GTC CTG GCC TTA 
Picum. aurifrons 18254* ATG CTA GCC TTA 
Picum. aurifrons 18479 ATC CTA GCC TTA 
Picul. rubiginosus 5162* GTC CTA GCC CTA 
Picul. rubiginosus 5222 GTC CTA GCC CTA 
Pico. villosus 86144* GTC CTA GCC CTA 
Pico. villosus 86107 GTC CTA GCC CTA 
S. varius 86148* GTC CTA GCT CTA 
S. varius 86149 GTC CTA GCT CTA 
V. callonotus 5178* GTC CTA GCC TTA 
V. callonotus 5175 GTC CTA GCC TTA 
V. nigriceps 8176* GTC CTA GCC CTA 
V. nigriceps 1305 GTC CTA GCC CTA 
Co. rupicola 8204* GTC CTA GCC CTA 
Co. rupicola 3901 GTC CTA GCC CTA 
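
As a reading aid for the block above, the following minimal sketch counts pairwise differences among three of its rows over the twelve sites shown (read from the column header as sites 778 to 789), skipping any site at which either sequence is scored "?". The function name `pairwise_differences` and the dictionary layout are illustrative assumptions and are not part of the original appendix; only the codons are copied from the table.

```python
from itertools import combinations

# Codons copied from the preceding block; "?" marks an undetermined base.
ROWS = {
    "Co. auratus 8618*": "GTC CTA GCC CTA",
    "Ca. haematogaster 11786*": "GTC CTC GCT CT?",
    "M. carolinus 8614*": "GTC CTG GCC TTA",
}


def pairwise_differences(seq_a, seq_b, missing="?"):
    """Return (differences, sites_compared) for two aligned sequences.

    Spaces between codons are ignored. A site is skipped when either
    sequence carries the missing-data symbol at that site.
    """
    a = seq_a.replace(" ", "")
    b = seq_b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    differences = 0
    compared = 0
    for x, y in zip(a, b):
        if x == missing or y == missing:
            continue
        compared += 1
        if x != y:
            differences += 1
    return differences, compared


if __name__ == "__main__":
    for (name_a, seq_a), (name_b, seq_b) in combinations(ROWS.items(), 2):
        diffs, sites = pairwise_differences(seq_a, seq_b)
        print(f"{name_a} vs {name_b}: {diffs} differences over {sites} sites")
```

Treating "?" as uninformative, rather than as a fifth character state, is a design choice of this sketch; other treatments of missing data would give different counts.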


ccc 
cct 
сст 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 


777 


012 
Gcc 
GCC 
GCT 
GCT 
всс 
GCC 
GCC 
GCC 
GCC 
GCT 
GCT 
GCC 
GCC 
всс 
GCC 
GCT 
GCT 
GCT 
GCT 
GcC 
GCC 


GCA 
GCA 
GCA 
GCA 
GCA 


777 


345 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
всс 
GCC 
GCC 
GCC 
Gcc 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 


666 
888 
567 
AAC 
AAC 


? AAC 


22? 
AAC 
??? 
AAC 
AAC 
AAC 
AAC 
AAC 
AAC 
AAC 
AAT 
AAT 
AAC 
AAC 
AAC 
AAC 
AAC 
AAC 


777 
999 
678 
TCT 
TCC 
TCC 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
ТСА 
ТСА 
тсс 
тсс 
ТСА 
ТСА 
ТСА 
ТСА 
тсс 
тсс 


666 
889 
890 
ccc 
ccc 
ccc 


GTC 
GT? 
GT? 
GTC 
GTC 
GTC 
ATC 
ATC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTC 
GTT 
GTT 


666 666 666 


999 
123 
CTA 
CTA 
CTG 


000 


CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 


999 
456 
GTC 
GTC 
G?T 
??? 
GTC 
G?C 
GTA 
ATC 
ATC 
GTC 
GTC 
GTT 
GTT 
GTC 
G?C 
GTC 
GTC 
GTC 
GT? 
GTC 
GTC 


888 
000 
567 
ATC 
ATC 
ATY 
ATT 
GTC 
GTC 
ATC 
GCT 
GCT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATT 
ATT 
GTC 
GTC 


999 
789 
ACT 
ACT 
ACA 
ACA 
ACA 
ACA 
ACG 
ACA 
ACA 
ACT 
ACT 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACT 
ACT 


888 
001 
890 
CTA 
CTA 
CTA 
CTA 
TTA 
TTA 
CTA 
crc 
CTC
CTA
CTA
CTA
CTA
CTA
CTA
CTA
CTA
CTA
CTA
CTA
CTA


777 
000 
012 
ccc 
ccc 
сс? 
ссс 
ссс 
ссс 
ссс 
ССА 
ССА 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 


111 
123 
ТТТ 
ТТТ 
TTC 
TT? 
TTC 
TTC 
TTC 
ТТС 
TTC 
ттс 
ттс 
ттс 
ттс 
TTC 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
TIT 


777 
000 
345 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
ссс 
GET. 
сст 
ссс 
ссс 
ccc 
ССТ 
ССА 


CCT 
ССТ 
ссс 
ссс 
ссс 
ссс 


111 
456 
СТА 
СТА 
?TA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
TTA 
TTA 


777 
000 
678 
CAC 
CAC 
CAT 
CA? 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAT 
CAT 
?АТ 
?АТ 
САС 
САС 
САС 
САС 
САС 
САС 


111 
789 
cce 
GCC 
Gc? 
GCC 
А?А 


АТТ 
ccT 
CcT 
GCC 
6CC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 


777 
011 
901 
ATC
ATC
AT?
ATT
ATT
ATT
ATC
ATC
ATC
ATC
ATC
ATT
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 


222 
012 
сст 
CCT 
ccc 
ccc 
ccc 
ccc 
ccc 


CAA 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 
ccc 


TT? ТТТ 777 777 


111 
234 


ESERSESSSSESS 


> 
Ed 
e 


EHHH 


стс 
стс 
стс 
CTC 
стс 
CTC 
TTT 


111 
567 
ccc 
ccc 
cc? 
ccc 
ccc 
ссс 
ссс 
CCT 
CCT 
ccc 
ccc 
ccc 
ccc 
ccT 
CCT 
CCA 
CCA 
CCA 
CCA 
ccc 
ссс 


888 
222 
678 
стс 
crc 
стс 
стс 
стс 
стс 
стс 
СТА 
СТА 
стс 
CTC 
стс 
стс 
стс 
стс 
CTT 
CTT 
CTT 
CTT 
crc 
CTC 


112 
890 
GAA 
GAA 
GAA 
GAA 
GAA 


CAT 
CAC 
сас 
Cc 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAC 
CAT 
CAC 
CAC 


222 
123 
TGG 
TGG 
TGA 
TGG 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGA 
TGG 
TGG 
TGA 
TGA 
TGG 
TGG 
TGG 
TGG 
TGA 
TGA 


888 
333 
234 
ATA 
ATA 
GTA 
?TA 
ATG 
GTG 
ACA 
ATG 
ATG 
ACA 
ACA 
ACA 
ACA 
GTA 
GTA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 


777 
222 
456 
TAC
TAC
TAC
TAC
TAC
TAC
TAC
TGC 
TGC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 
TAC 


B88 
333 
567 
TCC 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тсс 
тес 
тст 
тст 
тст 
TCT 
TCC 
TCC 
тес 
тсс 
тсс 
тсс 


777 
222 
789 
TTC 
TIC 
TTC 
TTC 
ттс 
TTC 
TTC 
TCC 
TCC 
TTC 
TTC 
TIC 
TTC 
ТТТ 
ТТТ 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 


BESTE 


111101411013: 241- 


777 
333 
012 
СТА 
СТА 
СТА 
СТА 
СТА 
СТА 
ТТА 
СТА 
СТА 
ста 
CTG 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 
CTA 


123 


777 
333 
345 
TTC 
TIC 
TTT 
TIT 
TTC 
TTC 
ТТТ 
TTC 
TTC 
TTC 
TTC 
TIT 
TIT 
T?C 
T?C 
ТТТ 
TIT 
TIT 
ТТТ 
TIT 
ТТТ 


888 
444 
456 
CGC 
CGC 
CGT 
CGT 
CGC 
ccc 
CGT 
GAC 
GAC 
CGT 
CGT 
CGC 
CGC 
сас 
сас 
сас 
сас 
Cac 
CGC 
25? 
сас 


777 
333 
678 
GCA 
G?A 
GGA 
G?A 
G?A 
G?A 
GCT 
CCA 
CCA 
GCA 
GCA 
GCA 
GCA 
GTA 
GTA 
GCA 
GCA 
G?A 
GCA 
G?A 
G?A 


444 
789 
ACA 
ACA 
ACT 
ACC 
ACA 
ACA 
ACA 
ATA 
ATA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACA 
ACG 
ACG 
ACG 


777 777 777 777 


344 
901 
TA? 
TAT 
TA? 
TA? 
TAC 
?AC
TAT
TAT
TAT
TAT
TAT
TAC
TAC
TAC
TAC
TAT
TAT
TAT
TAT
TAT
TAT


888 
555 
012 
ATG 
ATG 
ATA 
ATA 
ATA 
ATA 
ATA 
ATG 
ATG 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 
ATA 


444 
234 
GCC 
G?C 
GTC 
GCC 
GAC 
G?C 
GCT 
GCC 
GCC 
GCC 
GCC 
GCC 
GCC 
GCT 
GCT 
GCT 
GCT 
acc 
GCT 
Gcc 
G?C 


888 
555 
345 
GCC 
GCC 
всс 
GCC 
GCC 
GCC 
GCC 
GGC 
GGC 
GCC 
GCC 
GCC 
GCC 
GCC 
?CC 
GCT 
GCT 
GCC 
GCC 
GCC 
GCC 


444 
567 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATT 
ATT 
ATT 
ATC 
ATC 
ATT 
ATT 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 
ATC 


888 
555 
678 
TTC 
TTC 
TTC 
TTC 
TTC 
TTC 
TIC 
TTC 
TTC 
TTC 
TIC 
TTC 
ттс 
TTC 
TTC 
TTC 
TIC 
TTC 
TIC 
TTC 
TTC 


445 
890 

INTRODUCTION 


The mid- to late nineteenth century was the heyday of comparative vertebrate 
taxonomy (e.g., Garrod, 1874; Gadow, 1892). Many subfamilial relationships were 
self-evident even before the Darwinian revolution. At higher taxonomic ranks, 
however, there is a discontinuity in morphological similarity that still obfuscates the 
relationships of many families and orders of birds (e.g., compare Cracraft, 1981; 
Olson, 1985; Sibley and Ahlquist, 1990). 

Gruiformes present many unsolved mysteries of systematics and biogeography. 
They include many highly diverged, depauperate or monotypic families scattered 
patchily the world over. Most previous attempts to resolve their phylogeny have 
yielded conflicting results. 

Molecular data hold great promise for resolving problem phylogenies. The 12S 
rDNA is used here to address gruiform phylogeny because it includes both evolu- 
tionarily labile and conserved regions, hence it is believed to have a broad window 
of resolution for addressing recent and very ancient divergences (Kocher et al., 1989; 
Mindell and Honeycutt, 1990; Simon et al., 1994). We achieve some resolution of 
gruiform phylogeny and contribute some basic description of the evolution of the 
12S rDNA gene. Complementary analysis of morphological characters is treated 
elsewhere (Houde, in preparation). 


A. The Birds 


Gruiformes traditionally include rails, coots, and gallinules (Rallidae, 120 species 
worldwide; e.g., Gallinula, Laterallus, Rallus), roatelos (Mesitornithidae, 3 spe- 
cies, Madagascar, e.g., Mesitornis), hemipodes (Turnicidae, 15 species, from Africa 
through southern Eurasia to Australia, e.g., Turnix), the Australian plains-wanderer* 
(Pedionomidae, 1 species, Australia, not used in this study), finfoots and sungrebe 
(Heliornithidae, 3 species, pantropical, e.g., Heliornis, Podica), the kagu (Rhyno- 
chetidae, 1 species, New Caledonia, i.e., Rhynochetos), the sunbittern (Eurypygidae, 
1 species, South America, i.e., Eurypyga), trumpeters (Psophiidae, 3 species, South 
America, i.e., Psophia), seriemas (Cariamidae, 2 species, South America, e.g., Car- 
iama), the limpkin (Aramidae, 1 species, Neotropics, i.e., Aramus), cranes [Gruidae, 
14 species, cosmopolitan except South America, i.e., Anthropoides (= Grus), Balear- 
ica, Grus], and bustards [Otididae, 23 species, from Africa through southern Eurasia, 
to Australia, e.g., Ardeotis (= Choriotis)] (Sibley and Monroe, 1990). We use the 
familial nomenclature of Wetmore (1960). 

At the morphological extremes of this assemblage are the small, quail-like 
hemipodes and the large, long-legged cranes. Seriemas, trumpeters, kagu, and limpkin 
are lanky and superficially cranelike, with stubby tails (except seriemas). Seriemas, 
trumpeters, sunbittern, finfoots, kagu, and some rails are forest dwellers. Hemi- 
podes, bustards, and cranes inhabit prairies or steppes, although cranes also inhabit 
wetlands. Limpkin, finfoots, and rails prefer watery habitats. 


Phylogenetic Issues 


Most of what is commonly believed about the relationships of Gruiformes can be 
traced to descriptive comparative anatomy and phyletic inferences made more than 
a century ago (e.g., Fürbringer, 1888; Gadow, 1892). Within the past three decades, 
however, new issues in gruiform phylogeny and systematics have been raised and 
old ones rekindled. 

*Lowercase lettering for proper names of birds has been used throughout this chapter to conform 
with the editorial policy of this book. 

FIGURE 5.1 Relationships of Gruiformes and others according to Olson’s (1985) phenetic assessment 
of morphology and paleontology (tree interpreted from text only). Branch lengths are not proportional 
to distance. 

Olson worked mainly with rails (Olson, 1973, 1975, 1977, 1985), maintaining 
that trumpeters are sister to rails and resurrecting an old idea that the extinct flight- 
less Aptornis (= Apterornis) from New Zealand is closer to kagu than to rails (Bed- 
dard, 1898 vs Brodkorb, 1967). As strong proponents of gruiform polyphyly (see 
Fig. 5.1), Olson and Steadman (1981) asserted that Australian plains-wanderer and 
bustards are Charadriiformes, not Gruiformes (Olson, 1985). Olson further sug- 
gested that sunbittern and/or roatelos are relicts of an ancient assemblage that in- 
cludes herons (Olson, 1979, 1985). Olson considered ibises to be intermediate be- 
tween Charadriiformes and Gruiformes. Olson and Steadman marshaled an eclectic 
assortment of (mostly) osteological traits in support of their hypotheses. 

Cracraft erected a group, "Psophii," to which limpkin and then cranes are 
successive sister groups (Cracraft, 1981, 1982; see Fig. 5.2). In it, trumpeters and seriemas form 
a sister clade to sunbittern and kagu plus Aptornis. The effort by Cracraft was the 
first attempt at a cladistic analysis of gruiform osteology, but no formal analysis was 
available for his 26 characters. Cracraft, like Olson, advocated a closer relationship 
between Aptornis and kagu than between kagu and sunbittern. 

Sibley and colleagues came the closest to treating all traditionally recognized 
Gruiformes in a single analysis (Sibley and Ahlquist, 1985, 1990; see Fig. 5.3). They 
corroborated the treatment of plains-wanderer by Olson and Steadman. However, 
their later, noncommittal reconstructions replaced the nearly fully resolved 
dichotomous trees of their earlier works (Sibley et al., 1993). In their 
treatise, Phylogeny and Classification of Birds, hemipodes were placed as sister to all 
neognathous birds except fowl and waterfowl. They separated rails from all other 
Gruiformes at the subordinal rank “Ralli” (Sibley and Ahlquist, 1990). Roatelos 
were unstudied but also placed in their own suborder. Last, the suborder “Grues” 
consisted of a ladderized tree beginning apically with cranes, then limpkin plus 
sungrebe, trumpeters, seriemas or kagu or both, bustards, and then finally sunbittern at 
the base. It will be of interest below that in one figure they illustrated a sister relation 
of bustards and seriemas, even though in others they did not (1990: Fig. 335 vs 
Fig. 363). They stridently advocated a close sister relationship for limpkin and 
sungrebe, but not for kagu and sunbittern. However, both sungrebe and kagu were 
removed from their 1993 publication. In it, the only interfamilial relationship they 
advocated was between trumpeters and cranes. They found Gruiformes and 
Charadriiformes to be broadly indivisible (i.e., possibly not mutually monophyletic). 

FIGURE 5.2 Relationships of Gruiformes and others according to Cracraft’s (1982) cladistic analysis 
of morphological characters. Note the fundamental dichotomy of Gruiformes into suborders Ralli and 
Grues. Branch lengths are not proportional to distance. 


FIGURE 5.3 Relationships of Gruiformes and others according to Sibley and Ahlquist’s (1985; 1990) 
reconstruction from DNA hybridization. Note the exclusion of hemipodes from Gruiformes and the 
fundamental dichotomy of Gruiformes into suborders Ralli and Grui. Branch lengths are not 
proportional to distance. 

Houde argued that the treatment by Sibley and colleagues of the Neotropical 
sungrebe and limpkin had profound implications for interpretation of character 
polarity and biogeography (Houde, 1994; Houde et al., 1995). Sibley and Ahlquist 
(1990) intimated that the two might be more closely related to one another than 
the sungrebe is to the other two species of finfoots, which are endemic to Africa 
and Asia. Morphological character transformations constrained to this or similar 
topologies are one-third to one-half as parsimonious as on unconstrained 
morphology trees. Houde eventually dismissed the previous DNA hybridization results 
as irreproducible, rejected the monophyly of the limpkin-sungrebe clade, 
supported the monophyly of finfoots, and reaffirmed their long-held sister relationship 
to rails (Gadow, 1892; Sibley and Ahlquist, 1972; Olson, 1973; Cracraft, 1982; 
Houde, 1994). 

Several questions are within the scope of the present analysis. (1) Are tradi- 
tionally recognized Gruiformes monophyletic? In particular, should bustards and 
hemipodes be included in a monophyletic Gruiformes or Charadriiformes? Are 
seriemas related to secretary-bird? Are sunbittern or roatelos related to herons? 
(2) Apart from roatelos, does the first branch in Gruiformes separate all raillike birds 
from all cranelike birds, as in the subordinal classifications of Cracraft and Sibley 
and Ahlquist? (3) Are the Psophii of Cracraft monophyletic? If not, then are trum- 
peters sister to seriemas (as by Cracraft), cranes (as by Sibley and Ahlquist), or rails 
(as by Olson)? (4) Are finfoots monophyletic, and are they most closely related to 
the limpkin or to rails? (5) Are sunbittern and kagu sister taxa? (6) Is the fossil Ap- 
tornis more closely related to the kagu or to rails? 


B. The Gene 


12S rDNA is the smaller (about 1 kilobase) of two mitochondrial ribosomal DNAs. 
rDNA "gene" products are nonprotein-coding rRNAs that complex with proteins 
to form a ribosome. rRNAs fold onto themselves, like peptides, with evolutionarily 
conserved secondary structure (Fig. 5.4). The 12S gene can be subdivided into four 
principal domains, each of which includes self-complementing "stem" and single- 
stranded "loop" regions. Substitutions within stem regions may be selected against 
or precipitate compensatory (nonindependent) substitutions to maintain functional 
stem structure. Single-stranded regions may be involved in temporary base pairing 
with tRNAs and DNA templates during translation (Watson et al., 1987), and in 
binding and cross-linking the many proteins that collectively make up the larger 
ribosomal particles (Noller et al., 1990). 

The popular wisdom that loops are evolutionarily more labile than stems is not
entirely accurate (Vawter and Brown, 1993; Simon et al., 1994). Some loops contain 
regions of variable length, but so do some stems. There are motifs within loops that 
are invariant, from microbes to vertebrates; while some positions within stems are 
among the most variable of sites in the gene. 





FIGURE 5.4 Mitochondrial 12S rDNA, hypothesized structure for domains I-III modified from
Sullivan et al. (1995), with stems boxed and numbered according to Van de Peer et al. (1994). The 
sequence shown is a strict consensus for gruiform birds. Arrows indicate gaps in sequence numbering for 
alignment with outgroups. Font size of sequence is proportional to site diversity index (Shannon and 
Weaver, 1949; see Section II,A,6). Insertion/deletion sites are indicated by italics. 

The rate of evolution of the 12S rDNA gene is believed to be appropriate to the
level of phyletic divergence we aim to address, i.e., within late Cretaceous and Ter- 
tiary times (Mindell and Honeycutt, 1990). rDNA has been used for phylogenetic 
inference at greater and lesser taxonomic ranks (e.g., Kocher et al., 1989; Simon et 
al., 1994; Cummings et al., 1995). 


II. METHODS


A. DNA Sequence Data 


1. Sources 


Most DNA samples were obtained from ultrafrozen or chemically preserved soft 
tissues [Gruiformes: Aramus guarauna, Ardeotis (= Choriotis) sp., Heliornis fulica, Mes- 
itornis unicolor, Laterallus melanophaius, Rallus longirostris, Turnix sp.; Charadriiformes: 
Larus heermani; Ciconiiformes: Phimosus infuscatus] or whole blood [Gruiformes: 
Anthropoides (= Grus) virgo, Balearica pavonina, Cariama cristata, Eurypyga helias,
Gallinula chloropus, Grus canadensis, Psophia crepitans], but a few were obtained from
museum skins and skeletons [Gruiformes: Podica senegalensis, Rhynochetos jubatus,
Aptornis (= Apterornis) sp.]. Calcium salts were completely removed from bone by
chelation with EDTA. DNA was isolated from tissues by proteinase K digestion,
phenol-chloroform extraction, and ethanol precipitation and, when necessary,
purification by glass milk or anion-exchange column (Sambrook et al., 1989). The
12S rDNA gene was molecularly cloned by polymerase chain reaction (PCR) (Mul- 
lis and Faloona, 1987; Saiki et al., 1988). The gene was amplified intact by priming 
on conserved flanking tRNAs that readily permit amplification by PCR in novel 
organisms (e.g., Kocher et al., 1989), except when DNA derived from museum 
specimens was used (Houde and Braun, 1988; Cooper et al., 1992). Sequencing 
templates were constructed by λ exonuclease digestion of asymmetrically
phosphorylated PCR products (Higuchi and Ochman, 1989), which were sometimes
gel purified (Kretz et al., 1989). The method of sequencing was direct PCR
sequencing using dideoxy chain termination (Sanger et al., 1977), as modified
(Engelke et al., 1988; Sheen and Seed, 1988). Sequencing primers were spaced at
internal sites across both strands, specific to chicken sequence, in conserved regions
chosen by alignment of sequences of diverged lineages (e.g., Anderson et al., 1981; 
Clary and Wolstenholme, 1985; Desjardins and Morais, 1990). Sequences were 
verified from both strands, except in the most 5' region (varies between taxa). 
Negative controls lacking template DNA were run to check against contaminating 
DNA in reaction mixtures and pipettors. Negative-control DNA extracts of mu- 
seum specimens were carried through every step from extraction to sequencing. 
DNA extraction and PCR setup were performed in a remote, dedicated PCR-free 
laboratory. 
Approximately 870 bases representing domains I-III were obtained for all ingroup
taxa except Aptornis, Podica, and Rhynochetos (GenBank accession numbers
1776011-76027). The latter are represented by 673, 336, and 388 bases, respec- 
tively. 

Unpublished complete 12S rDNA sequences of outgroups stone curlew
(Charadriiformes: Burhinidae: Burhinus oedicnemus), night heron (Ciconiiformes:
Ardeidae: Nyctanassa violacea), and secretary bird (Falconiformes: Sagittariidae:
Sagittarius serpentarius) were kindly provided by D. P. Mindell. The complete sequence
of chicken (Galliformes: Phasianidae: Gallus gallus; Desjardins and Morais, 1990),
partial sequences of sandpiper, gull, and murre [Charadriiformes: Scolopacidae:
Calidris maritima (X76362), Laridae: Larus canus (X76361), and Alcidae: Uria aalge
(X76435), respectively; Moum et al., 1994], and stork [Ciconiiformes: Ciconiidae:
Ciconia nigra (L33370); Hedges and Sibley, 1994] were obtained from GenBank.


2. Sampling Considerations 


We used 17 gruiform species to represent an order that includes 196. Thus, auta- 
pomorphies of the species sampled may be mistaken for synapomorphies of families 
(e.g., Patterson et al., 1993). This problem is diminished because 8 of the 12 grui- 
form families include 3 or fewer species. Rallidae are the only family with more 
than 25. Our sampling in no way represents the nominal diversity within Rallidae 
but should address its interfamilial relations since rallid monophyly is supported by 
both molecular and morphological studies (Olson, 1973; Sibley and Ahlquist, 
1990). We sampled 32% of all genera and 74% of the nonrallid genera in the order. 
All traditionally recognized gruiform families except the plains-wanderer are rep- 
resented, making this the most comprehensive investigation of gruiform molecular 
systematics to date. There is agreement from molecular and morphological phenetic 
studies that the plains-wanderer is not gruiform (Olson and Steadman, 1981; Sibley 
and Ahlquist, 1985, 1990). 

We sampled one or two individuals (rarely three) per species, with about equal 
frequency across taxa. Moore (1995) showed that internode lengths in four recently 
evolved woodpecker species were almost always longer than coalescence time for 
mitochondrial DNA (mtDNA). Thus, lineage sorting (Avise et al., 1984) should 
rarely if ever corrupt phylogeny reconstruction for groups with equal or greater 
internode lengths (i.e., supraspecific levels; age of divergence is inconsequential). 
Gene phylogeny should adhere to organismal phylogeny, and between-species 
variation should exceed within-species variation. Accordingly, the two most closely 
related species in this study [Grus canadensis and Anthropoides (= Grus) virgo] display 
numerous transition substitutions. We never detected sequence differences in any 
two samples of one species, except one apparently heteroplasmic individual with a 
single transition substitution. 

Small samples increase the risk that species misidentifications will go unnoticed. 
We detected one mislabeled specimen (Balearica mislabeled as Cariama) only because 
we sequenced other samples of both species. Voucher specimens (a live bird in this
case) are no assurance against sample mislabeling. 


3. Sequence Alignment 


Sequence alignment was initiated with a pairwise similarity measure (MacVector 
4.14; Needleman and Wunsch, 1970) and was improved by individual discretion 
(see below). Sequences were fitted to a map of secondary structure to identify com- 
plementary positions (e.g., Kjer, 1995). In so doing, discrepancies between opposite 
strands resulting from “compressions” (i.e., bases missing on one strand but not the 
other) were discovered and resolved (Fig. 5.5). The mapping of sequences onto 
structural models also served to monitor the possible existence of nuclear pseudo- 
genes of mtDNA sequences (Fukuda et al., 1985). The hypothesis of an endosym- 
biont origin of mitochondria predicts the existence of nuclear copies of mitochon- 
drial genes because the mitochondrial genomes themselves are depauperate in 
housekeeping genes (Gray et al., 1984). Inasmuch as nuclear pseudogenes are released
from selective constraints, conserved binding motifs and stem complementarity
would be conspicuously degraded or absent in nuclear copies of mitochondrial
rDNA. We detected no such instances.
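
The structural screen described above can be illustrated with a simple complementarity score for one stem. This is a minimal sketch and not the authors' procedure: the function name, the pairing table (Watson-Crick pairs plus the G-U wobble pair, read as RNA), and the toy hairpin are all assumptions made for illustration.

```python
# Hypothetical helper, not part of the original analysis.
# Pairings allowed in an rRNA stem: Watson-Crick plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_complementarity(seq, stem_pairs):
    """Return the fraction of paired stem positions that can base-pair.

    seq        -- DNA or RNA sequence (T is read as U)
    stem_pairs -- list of (i, j) 0-based index pairs taken from a
                  secondary-structure map
    """
    rna = seq.upper().replace("T", "U")
    if not stem_pairs:
        raise ValueError("stem_pairs must not be empty")
    paired = sum((rna[i], rna[j]) in PAIRS for i, j in stem_pairs)
    return paired / len(stem_pairs)


# Toy hairpin: three stem pairs closing a four-base loop.
stem = [(0, 9), (1, 8), (2, 7)]
print(stem_complementarity("GGCAAAAGCC", stem))  # intact stem
print(stem_complementarity("GGCAAAAGAC", stem))  # one pair disrupted
```

A sequence whose stems score well below those of its relatives would be a candidate for closer inspection as a possible nuclear copy.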

FIGURE 5.5 Sequencing artifact. Autoradiograph of sequencing gel showing a common sequencing
artifact in mitochondrial 12S rDNA, domain III, stem 32 (Eurypyga helias shown). Left: Double loading
of L strand. Right: Double loading of H strand, reverse complemented; arrows indicate G and C bases
not evident on opposite strands.

Further improvement of alignment was made according to the principle of
interactive phylogenetic weighting (Feng and Doolittle, 1987; Hein, 1990; Konings
et al., 1987; Lake, 1991; Mindell, 1991; Thorne and Kishino, 1992). Regions of
variable length were subjected to successive bouts of phylogeny reconstruction
separately from the complete data set using maximum parsimony following alterations
of alignment (Section II,A,6). We wanted to determine how “badly” (i.e.,
counter to available phylogenetic information) alignments could be contrived before
traditionally recognized monophyletic families no longer associated with
themselves in phylogeny reconstruction (i.e., Gruidae, Rallidae, and Heliornithi- 
dae). Some effect of variable alignment was observed, but most often alignment had 
little or no consequence on phylogeny reconstruction in this study. In this data set 
synapomorphies of close taxa usually provided sufficient phylogenetic signal to re- 
construct sister relationships, whether the synapomorphies are aligned to gaps or to 
a background of sequence "noise" of questionable homology (i.e., randomized 
sequence). 

When no phylogenetic information was available, we strived to minimize any 
impact of alignment on phylogenetic inference. When phylogenetic information 
was available, we made alignments according to a parsimony principle of invoking 
the fewest number of changes between sequences from well-supported sister taxa. 
This could just as likely involve the insertion of gaps in nonhomologous positions 
to maintain sequence alignment as insertion of gaps at homologous positions. Se- 
quence alignments were finalized according to a distance optimality criterion based 
on majority segregation of purines vs pyrimidines (i.e., minimizing inferred trans- 
versions across all taxa without reference to a hypothesis of phylogeny). 
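
One way to read the purine/pyrimidine criterion is as a column-by-column count of how many sequences fall in the minority class. The sketch below scores an alignment that way so that alternative alignments can be compared; it is an interpretation, not the authors' implementation, and the function name and toy alignments are hypothetical.

```python
# Hypothetical scoring of an alignment by purine/pyrimidine segregation.
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CTU")


def ry_minority_score(alignment):
    """Sum, over columns, of the size of the minority class (purine or pyrimidine).

    alignment -- list of equal-length aligned strings; gaps and ambiguity
                 codes are ignored. Lower scores imply fewer inferred
                 transversions, with no reference to a tree.
    """
    if len({len(s) for s in alignment}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    score = 0
    for column in zip(*(s.upper() for s in alignment)):
        n_purine = sum(base in PURINES for base in column)
        n_pyrimidine = sum(base in PYRIMIDINES for base in column)
        score += min(n_purine, n_pyrimidine)
    return score


# Two placements of the same single gap; the second avoids pairing C with G.
print(ry_minority_score(["AGCT", "A-GT"]))
print(ry_minority_score(["AGCT", "AG-T"]))
```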


4. Transformation Weighting 


As transformations saturate, the observed ratio of transversions to transitions devi- 
ates significantly from the instantaneous ratio (Brown et al., 1982; Mindell and Ho- 
neycutt, 1990; Knight and Mindell, 1993). In other words, the ratio of transversions 
to transitions appears much closer to 1:1 for deep divergences than for shallow 
divergences. In spite of this, approximately the same intrinsic difference in rates of 
transversions to transitions probably occurred throughout the evolutionary history 
of a group, i.e., all levels of divergence. Thus, weighting schemes for phylogeny 
reconstruction should attempt to employ the instantaneous ratio rather than one 
averaged across all levels of divergence, including those saturated. Transversion 
weighting makes the difference between the recovery or lack of recovery of the 
traditionally recognized monophyletic clades Gruidae, Rallidae, and Heliornithidae 
in many of our phylogeny reconstructions. The monophyly of these families is sup- 
ported in whole or part by a variety of morphological and DNA studies, employing 
both phenetic and cladistic methodologies (Olson, 1973, 1985; Sibley and Ahlquist, 
1990; Krajewski, 1989; Houde, 1994; Krajewski and Fetzner, 1994). 
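
The distinction between transitions and transversions that underlies this weighting can be made concrete with a pairwise count. The sketch below is illustrative only: the function names are hypothetical, and the default transversion weight of 7.3 is simply the value adopted later in this section.

```python
# Hypothetical helpers for counting and weighting substitution types.
BASES = frozenset("ACGT")
PURINES = frozenset("AG")


def classify(a, b):
    """Return 'match', 'transition', 'transversion', or None for gaps/ambiguities."""
    a, b = a.upper(), b.upper()
    if a not in BASES or b not in BASES:
        return None
    if a == b:
        return "match"
    # Same chemical class (purine-purine or pyrimidine-pyrimidine) is a transition.
    return "transition" if (a in PURINES) == (b in PURINES) else "transversion"


def weighted_differences(s1, s2, tv_weight=7.3):
    """Return (transitions, transversions, weighted cost) for two aligned sequences."""
    ts = tv = 0
    for a, b in zip(s1, s2):
        kind = classify(a, b)
        if kind == "transition":
            ts += 1
        elif kind == "transversion":
            tv += 1
    return ts, tv, ts + tv_weight * tv


print(weighted_differences("AACCGGTT", "AGCTGATA"))
```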

We estimated a ratio of instantaneous rate of transversions to transitions from 
two most parsimonious phylogenies, one large and not known to be correct (in- 
cluding all 17 gruiform taxa herein), and the other small but believed to be correct 
(a ladderized tree of Grus, Anthropoides, Balearica, Aramus, Psophia, and Gallus). Both
trees produced identical results on relative transformation rates. The ratio of observed
transversions to transitions was expressed as a function of total substitutions.
In several cases where no transversions were observed, they were assigned a value of 
1 to preclude the biologically meaningless ratio of infinity. A second-order regres- 
sion was fitted to the plots and the intercept at one substitution was calculated. The 
instantaneous ratio is 6:1, even though there was substantial scatter of observed 
values near the origin (from 2:1 to 19:0). 
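
The regression step can be sketched as an ordinary least-squares fit of a second-order polynomial, evaluated at one substitution. This is a minimal illustration under stated assumptions: the function names are hypothetical, the fitting routine is a generic normal-equations solver rather than whatever software the authors used, and the data points are synthetic, not values from this study.

```python
# Hypothetical quadratic regression, evaluated at x = 1 substitution.
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix of the normal equations.
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("need at least three distinct x values")
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def ratio_at_one_substitution(total_substitutions, observed_ratios):
    """Value of the fitted curve at a total of one substitution."""
    a, b, c = fit_quadratic(total_substitutions, observed_ratios)
    return a + b + c


# Synthetic example: observed ratio declining as substitutions accumulate.
totals = [5, 10, 20, 40, 80, 120]
ratios = [5.6, 5.2, 4.5, 3.4, 2.1, 1.6]
print(round(ratio_at_one_substitution(totals, ratios), 2))
```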

We also estimated the transformation ratio by fitting a two-parameter model of 
transversion and transition rates to both the large and small phylogenies (Kimura, 
1980). We used TREECALC (Milligan, 1994) to find the transformation ratio with 
the maximum likelihood (Felsenstein, 1981) for the sequence matrix given a user- 
specified topology with specified branch lengths. Like the first approach, this esti- 
mates the instantaneous rate of transformations rather than an average rate of change 
over the entire phylogeny. The resulting ratio of 7.3:1 from both large and small 
phylogenies is in fairly good agreement with the previous estimate. We observed 
no difference in topology of optimal trees obtained by changing the weighting of 
transversions from 6 to 7.3 and we used the larger value for the phylogeny recon- 
structions presented here. 

In reality, A/Y transversions appear to outnumber G/Y transversions in our data 
by about an order of magnitude. In light of the high frequency of transitions, the 
few G/Y transversions may primarily represent A/Y transversion followed by 
A/G transition. A weighting scheme of 6:1 for A/Y transversions and 60:1 for 
G/Y transversions produced similar bootstrap support for the same groups as the 
7.3:1 weighting for all transversions, but performed worse in recovering Gruidae 
monophyly. 

Gaps were treated in two ways in parsimony analyses: as missing data and with 
an intermediate weight of 4 (to satisfy the triangle inequality for weights of all trans- 
formation types) with 8:1 transversion-to-transition weights. 
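
The triangle inequality mentioned here requires that no direct change cost more than any two-step path between the same states. The sketch below checks that condition for a five-state cost scheme (four bases plus gap); the function names are hypothetical, and the check is a generic one rather than a feature of any particular program.

```python
# Hypothetical check of the triangle inequality for a step matrix.
from itertools import permutations

STATES = "ACGT-"
PURINES = frozenset("AG")


def step_cost(a, b, ts=1.0, tv=8.0, gap=4.0):
    """Cost of changing state a into state b under the given weights."""
    if a == b:
        return 0.0
    if "-" in (a, b):
        return gap
    return ts if (a in PURINES) == (b in PURINES) else tv


def triangle_violations(ts=1.0, tv=8.0, gap=4.0):
    """List (a, b, c) triples where a->c costs more than a->b plus b->c."""
    def cost(a, b):
        return step_cost(a, b, ts, tv, gap)
    return [(a, b, c) for a, b, c in permutations(STATES, 3)
            if cost(a, c) > cost(a, b) + cost(b, c)]


print(len(triangle_violations(gap=4.0)))  # weights used in the text
print(len(triangle_violations(gap=3.0)))  # a smaller gap weight
```

With transversions weighted 8, a gap weight below 4 would make a transversion more costly than a deletion followed by an insertion, which is why 4 is the smallest admissible gap weight under this scheme.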


5. Position Weighting 


The number of substitutions per site was estimated from the most parsimonious tree
of 17 ingroup taxa obtained with 7.3-to-1 transversion weighting (Table I). Small
differences in tree topology (i.e., maximum likelihood and neighbor-joining trees)
have almost no effect whatsoever on the number of substitutions per site. We
nevertheless acknowledge that observed values both overestimate and underestimate
actual values.


TABLE I Frequencies of Substitutions per Site

                                            Number of substitutions per site
                             Total  Variable     1     2     3     4     5     6     7     8
Number of sites                934       395   104    90    76    62    36    20     6     1
Percentage of sites            100      42.3  11.1   9.6   8.1   6.6   3.9   2.1   0.6   0.1
Percentage of substitutions    100       100   9.4  16.2  20.5  22.3  16.2  10.8   3.8   0.7
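
The percentages in Table I follow directly from the site counts, which gives a quick consistency check on the table. The sketch below recomputes them; the variable names are hypothetical, and the total of 1110 inferred substitutions is derived here from the counts rather than stated in the text.

```python
# Recompute the percentage rows of Table I from the "Number of sites" row.
sites_by_changes = {1: 104, 2: 90, 3: 76, 4: 62, 5: 36, 6: 20, 7: 6, 8: 1}
total_sites = 934

variable_sites = sum(sites_by_changes.values())
total_substitutions = sum(x * n for x, n in sites_by_changes.items())

pct_sites = {x: round(100 * n / total_sites, 1)
             for x, n in sites_by_changes.items()}
pct_substitutions = {x: round(100 * x * n / total_substitutions, 1)
                     for x, n in sites_by_changes.items()}

print(variable_sites, total_substitutions)
print(pct_sites)
print(pct_substitutions)
```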

Four weighting schemes were applied in subsequent bouts of phylogeny reconstruction:
equal (weight = 1), reverse (weight = x), inverse (weight = 1/x), and
quadratic weights (weight = 1/x², where x = number of substitutions per site).
Each was used with and without transformation weighting, on complete and par- 
titioned data sets, and in jackknife and bootstrap analyses. Reverse and inverse 
weighting produced identical trees. Equal weights performed best overall at recov- 
ering traditionally recognized families. Quadratic weighting performed worst. 
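
The four schemes amount to simple functions of x, the number of substitutions inferred at a site. A minimal sketch follows; the function name and scheme labels are hypothetical, and how the resulting weights are scaled or rounded for a particular parsimony program is left open.

```python
# Hypothetical mapping from substitutions per site (x) to a character weight.
def position_weight(x, scheme):
    """Return the weight of a site with x inferred substitutions."""
    if scheme == "equal":
        return 1.0
    if x < 1:
        raise ValueError("x must be at least 1 for a variable site")
    if scheme == "reverse":
        return float(x)
    if scheme == "inverse":
        return 1.0 / x
    if scheme == "quadratic":
        return 1.0 / x ** 2
    raise ValueError("unknown scheme: " + scheme)


for scheme in ("equal", "reverse", "inverse", "quadratic"):
    print(scheme, [round(position_weight(x, scheme), 3) for x in range(1, 9)])
```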

Stems were not weighted differently from loops to account for compensatory 
substitutions because stems, loops, and bulges evolve at about the same rate (Vawter 
and Brown, 1993). 

Since among-site evolutionary rate variation is known to complicate phylogeny 
reconstruction (e.g., Milkman and Crawford, 1983; Huelsenbeck and Hillis, 1993), 
intuition dictates that position weighting would improve phylogenetic estimates. It 
did not appear to work well in this study. 


6. Data Partitioning 


Nucleotide data were analyzed in total and in subsets. Variable length and flanking
regions that were subject to alternative alignment consisted of sites 82–116,
140–149, 245–259, 320–323, 339–347, 352–358, 409–424, 433–438, 520–528,
707–711, 808, 809, 817–822, and 902–908. Each of the aforementioned
position and transformation weighting schemes was used in phylogeny reconstruc- 
tion with and without these sites removed from the data set. Inclusion of variable- 
length regions tended to improve tree resolution and bootstrap support of nodes. 
Many synapomorphies of the traditionally recognized families occur within variable 
length regions. 

We analyzed bases 500—920 (the “12Sa-b” region of Kocher et al., 1989) sepa- 
rately to see whether this region was representative of the entire gene. In short, the 
answer is no. Although bootstrap values for some clades increased compared with 
those obtained from the entire data set, the monophyly of cranelike birds was lost. 

We partitioned data according to number of changes per site and analyzed each 
class individually and in groups. This approach addresses among-site evolutionary 
rate variation and saturation of most-variable sites. Popular wisdom holds that sites 
that change most are most homoplasious (e.g., Sullivan et al., 1995). Thus, one 
might rationalize their removal. We were surprised to discover, however, that phy- 
logeny reconstruction using sites that change least yield thousands of equally parsi- 
monious trees. Least-variable sites appear to be the only ones lacking phylogenetic 
information. 

The inconsequence of saturation in our data is suggested by a regression of site 
consistency index, CI, to site diversity index, H' (Shannon and Weaver, 1949). H'
is the negative of the sum, over the nucleotides observed at a given position, of each
nucleotide frequency times the natural log of that frequency. It is a measure of the
amount of variation observed at a position. The maximum value is obtained by equal
frequency of each of the four bases at a position; the minimal value is obtained by
invariant sites. Site diversity is not a measure of substitution rate. A single
substitution can result in either high or low H', depending on where in the phylogeny
it occurred. The slope of the regression of H' and CI is -0.09 (P = 0.0006; variable
sites only), indicating that CI does not vary as a function of site diversity in these
data. In other words, most-diverse positions are no more or less consistent than
least-diverse positions. Substitution rate may be correlated with consistency, but
substitution rate and CI cannot be legitimately regressed because their calculations
are not independent.
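
The site diversity index can be computed per alignment column as the Shannon index of base frequencies. The sketch below is a minimal illustration with a hypothetical function name; it ignores gaps and ambiguity codes, which is an assumption, since the text does not say how those were handled.

```python
# Shannon diversity index H' for one alignment column.
import math
from collections import Counter


def site_diversity(column):
    """Return H' = -sum(p * ln p) over the bases observed in a column."""
    bases = [b for b in column.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("column contains no unambiguous bases")
    n = len(bases)
    return -sum((c / n) * math.log(c / n) for c in Counter(bases).values())


print(site_diversity("AAAAAAAA"))  # invariant site, the minimum
print(site_diversity("AAAAAAAG"))  # one change on a terminal branch
print(site_diversity("AAAAGGGG"))  # one change on a deep branch
print(site_diversity("AACCGGTT"))  # equal frequencies, the maximum (ln 4)
```

The second and third columns could each result from a single substitution, which illustrates why H' is not a measure of substitution rate.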


B. Phylogeny Reconstruction 


Trees are constructed and evaluated using parsimony [MP, in MacClade 3.03, 
(Maddison and Maddison, 1992), and PAUP 3.1.1 (Swofford, 1993); all searches 
performed with heuristic algorithm and optimized by accelerated transformation], 
dynamically weighted parsimony (DWP; Williams and Fitch, 1990), maximum 
likelihood (ML, in PHYLIP 3.5; Felsenstein, 1989), and neighbor-joining [NJ, in
MEGA 1.01, (Saitou and Nei, 1987; Kumar et al., 1993) and PHYLIP 3.5 (Felsen- 
stein, 1981, 1989)]. Figures of trees were created with TreeView (Page, 1996). 
Empirical base frequencies used in ML and NJ reconstructions are as follows: A, 
0.32033; C, 0.28244; G, 0.19667; T, 0.20056. 

The recovery of traditionally recognized groups—cranes and limpkin (Gruidae 
plus Aramidae), rails (Rallidae), and finfoots (Heliornithidae)— within larger phy- 
logeny reconstructions is our standard for the reliability of reconstruction and 
weighting methods. The monophyly of each of these has traditionally been ac- 
cepted (e.g., Wetmore, 1960), and is supported at least in part by DNA hybridiza- 
tion and sequences and morphology (Olson, 1973, 1985; Krajewski, 1989; Sibley 
and Ahlquist, 1990; Krajewski and Fetzner, 1994; Houde, 1994; Houde et al., 1995; 
and reviews therein). We acknowledge the potential circularity of seeking answers 
to phylogenetic questions in trees that use the recovery of accepted groups as a 
standard for evaluating trees. 

Dynamically weighted parsimony and NJ using gamma distances are methods 
that are specifically designed to cope with among-site evolutionary rate varia- 
tion; yet, these performed worst at recovering expected clades. DWP approximates 
trees that are otherwise obtained from inverse and quadratic position weighting 
schemes in MP. We were particularly frustrated by the profound effect that seed 
trees have on final trees in the WTSUBS program. Had this not been a factor, then 
we would not have felt constrained by the limits on taxa (i.e., nine) that are allow- 
able in an exhaustive search using the ALLTOPS program. We abandoned the NJ 
routine in MEGA after discovering that it could not accept IUPAC symbols. Two-parameter
distances (Kimura, 1980) come closer than gamma distances (Tajima and
Nei, 1984) to producing trees that recover traditionally recognized family groups. 
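
For reference, the two-parameter distance of Kimura (1980) can be computed from the proportions of transitional (P) and transversional (Q) differences as d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The sketch below applies that formula to a pair of aligned sequences; the function name is hypothetical, and skipping sites with gaps or ambiguity codes is an assumption, not a description of how MEGA or PHYLIP treat them.

```python
# Kimura (1980) two-parameter distance for two aligned sequences.
import math

BASES = frozenset("ACGT")
PURINES = frozenset("AG")


def k2p_distance(s1, s2):
    """Return the two-parameter distance; sites with gaps/ambiguities are skipped."""
    n = ts = tv = 0
    for a, b in zip(s1.upper(), s2.upper()):
        if a not in BASES or b not in BASES:
            continue
        n += 1
        if a != b:
            if (a in PURINES) == (b in PURINES):
                ts += 1
            else:
                tv += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p, q = ts / n, tv / n
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))  # one transition in ten sites
```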

All trees are rooted using chicken (Gallus). Although outgroups are ideally close 
sisters to the ingroup and chicken is not close to Gruiformes, certainty of outgroup 
status is paramount. Charadriiformes would be the obvious choice for outgroups if 
traditional classifications truly embody evolutionary history. However, all of the 
other "outgroup" taxa we examined in this study (Charadriiformes, Ciconiiformes, 
and Falconiformes) were chosen specifically to test for their potential relations as 
unrecognized ingroups. Chicken is the only taxon available that is known with 
certainty to be an outgroup. 


III. PHYLOGENETIC INFERENCES 
A. Results 


Certain groupings of taxa were partly or entirely consistent in spite of reconstruction
methods, weighting and data partitioning, and low bootstrap support (Felsenstein,
1985). The only clades with MP bootstrap values in excess of 95% in the
complete data set with all weighting and data partitioning regimes were
Anthropoides–Grus, Rallus–Laterallus–Gallinula, and Eurypyga–Rhynochetos. All NJ
bootstrap values for the same clades were lower, suggesting poorer resolution. We
believe the MP tree using 7.3:1 transversion weighting and no position weighting is
the best estimate of phylogeny (Fig. 5.6). We restrict all further discussion to that
tree unless stated otherwise. In it, Heliornis–Podica receives 92% bootstrap support
(if gaps are also assigned weight, then support for finfoots is > 95%). Figure 5.6
summarizes the results of jackknife (Lanyon, 1985) and bootstrap support for each
node in the parsimony analysis under six different weighting schemes on the complete
ingroup matrix. Most values are low, especially for deep nodes, and indicative
of high levels of homoplasy. Higher bootstrap values for trees constructed with subsets
of ingroup taxa are discussed further in the context of particular issues.
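
Bootstrap support of the kind reported here (Felsenstein, 1985) comes from resampling alignment columns with replacement and rebuilding the tree for each pseudoreplicate. The sketch below shows only the resampling and tallying steps; the function names are hypothetical, and `has_clade` stands in for whatever tree-building and clade test is applied to each replicate (a user-supplied callback, not a real library call).

```python
# Hypothetical bootstrap resampling of alignment columns.
import random


def bootstrap_alignment(alignment, rng):
    """Resample columns with replacement, keeping all taxa aligned.

    alignment -- dict mapping taxon name to an aligned sequence
    rng       -- random.Random instance (seeded for repeatability)
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in columns) for name in names}


def clade_support(alignment, has_clade, replicates=100, seed=0):
    """Percentage of pseudoreplicates for which has_clade(replicate) is true."""
    rng = random.Random(seed)
    hits = sum(bool(has_clade(bootstrap_alignment(alignment, rng)))
               for _ in range(replicates))
    return 100.0 * hits / replicates


toy = {"Grus": "ACGTAC", "Anthropoides": "ACGTAT", "Gallus": "TCGAAC"}
replicate = bootstrap_alignment(toy, random.Random(1))
print({name: len(seq) for name, seq in replicate.items()})
```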

Maximum likelihood and NJ trees differ from the MP tree in movement of 
several branches by one node. Maximum likelihood also groups Ardeotis with fin- 
foots (Fig. 5.7). Neighbor-joining groups Psophia and Turnix as sisters to cranes, and 
groups Balearica with Aramus, rather than with gruines (Fig. 5.8). Neighbor joining 
with variable-length regions of sequence omitted and 9:1 transversion weighting 
restores gruid monophyly, retains the position of Psophia, and places Turnix as sister 
to cranes plus rails. Maximum likelihood and NJ trees are both about 0.5% longer 
than the MP tree for character data. All branches on the ML tree are significantly 
positive (p < 0.01). All branches on the NJ tree are positive, but many internodes 
are extremely short (e.g., Ardeotis—Cariama). 

FIGURE 5.6 Parsimony jackknife and bootstrap tree of Gruiformes mitochondrial 12S rDNA, domains
I-III, obtained with 7.3:1 transversion weighting using PAUP 3.1.1 (Swofford, 1991). First six
numbers shown on each branch are jackknife (Lanyon, 1985) consensus values: (1) 7.3:1 transversion
weighting, no position weighting, all characters; (2) 7.3:1 transversion weighting, inverse position
weighting, all characters; (3) 7.3:1 transversion weighting, no position weighting, variable length regions
excluded; (4) 7.3:1 transversion weighting, inverse position weighting, variable length regions excluded;
(5) 8:1 transversion weighting, gaps weighted 4, no position weighting; and (6) 8:1 transversion weighting,
gaps weighted 4, inverse position weighting, respectively. Seventh number is the highest bootstrap
support obtained under any of the six weighting schemes (100 replicates). Asterisks indicate all weighting
schemes that provided bootstrap values of 95% or higher or the single weighting scheme that yielded the
highest bootstrap value if less than 95%. Jackknife values preceded by “contra” indicate majority consensus
levels that contradict the topology shown. Alternate branching topology and values are shown for
Mesitornis and Turnix. Variable length regions include sites 82–116, 140–149, 245–259, 320–323,
339–347, 352–358, 409–424, 433–438, 520–528, 707–711, 808, 809, 817–822, and 902–908.

FIGURE 5.7 Maximum likelihood tree of Gruiformes mitochondrial 12S rDNA, domains I-III,
obtained with 7.3:1 transversion weighting using PHYLIP 3.5 (Felsenstein, 1989); ln(likelihood) =
-6669.4. All branches significantly positive (p < 0.01). Empirical base frequencies used are as follows:
A, 0.32033; C, 0.28244; G, 0.19667; T, 0.20056.

Cranelike versus raillike birds do not comprise a fundamental dichotomy (i.e.,
most basal split) in Gruiformes in any of our reconstructions. Nine-taxon trees
(Anthropoides, Grus, Gallinula, Laterallus, Eurypyga, Rhynochetos, Gallus, and any two of
Balearica and/or Aramus and/or Psophia; 7.3:1 transversion parsimony; 1000 repli- 
cates) yielded bootstrap values >98% in support of a separation of cranes, trum- 
peter, and rails from sunbittern and kagu (Fig. 5.9). In almost all reconstructions 
from the complete ingroup data set, seriema, bustard, roatelo, and hemipode are 
outside of the cranes—rails clade. 

Bootstrap support for the trumpeter–rails clade is lacking in the complete data
set but rises to 90% in the “12Sa-b” subset with 4:1 transversion weighting. The
ML and MP trees agree, but NJ always groups trumpeter with cranes. Parsimony
tree lengths are trumpeter–cranes (1215, 30815, unordered and 7.3:1 transversion
weights, respectively), trumpeters–rails plus finfoots (1210, 30629), and
trumpeters–rails (1214, 30437) (Aptornis excluded for simplicity).

Our data support the monophyly of finfoots and their position in the clade of 
raillike birds, rather than with limpkin among the cranes as suggested by Sibley and 
Ahlquist (1985, 1990). But trumpeter is important for their placement near rails. 

FIGURE 5.8 Neighbor-joining bootstrap tree of Gruiformes mitochondrial 12S rDNA, domains I-
III, obtained with 7.3:1 transversion weighting and two-parameter distances (Kimura, 1980) using
PHYLIP 3.5 (Felsenstein, 1989). All bootstrap values >50% are labeled (100 replicates). Branch lengths
are calculated by least-squares fitting of distances to bootstrap tree. Empirical base frequencies used are
as follows: A, 0.32033; C, 0.28244; G, 0.19667; T, 0.20056.


Finfoots group with raillike birds, except (1) when both trumpeter is removed and 
Aptornis is included, with or without transversion weighting (in which case they 
group with limpkin, sister to cranes), or (2) when gaps are weighted (in which case 
they group with roatelo, sister to cranes plus limpkin). 

Unweighted trees placing Aptornis with kagu are 27 unordered steps longer
(kagu positioned as in Fig. 5.6) or 102 steps longer (kagu positioned as in Cracraft’s
Psophii) than a grouping of Aptornis with rails. Bootstrap support for a
sunbittern–kagu clade exclusive of Aptornis is 95–100% in virtually every tree examined. In a
6-taxon tree (Aptornis, Eurypyga, Gallinula, Gallus, Laterallus, Rhynochetos; 7.3:1
transversion parsimony; 1000 replicates) of only bases 200–920 for which the
Aptornis sequence is complete, bootstrap support is 99% for rails, 99% for rails plus
Aptornis, and 98% for sunbittern–kagu (Fig. 5.10). The decay index for the
Aptornis–rails node is 6% in this tree (expressed as a percentage to account for 7.3:1
transversion weighting). Although optimal transversion parsimony trees indicate a
sister relationship between Aptornis and finfoots in the full data set, bootstrap support
for this grouping is always ≤ 82%.

FIGURE 5.9 Consensus 9-taxon parsimony bootstrap tree of Anthropoides, Grus, Gallinula, Laterallus,
Eurypyga, Rhynochetos, Gallus, and any 2 of Balearica and/or Aramus and/or Psophia (1000 replicates,
heuristic parsimony with 7.3:1 transversion weighting of mitochondrial 12S rDNA, domains I-III, using
PAUP 3.1.1; Swofford, 1991). Bootstrap values in excess of 90% in support of a cranes–rails clade,
exclusive of sunbittern–kagu, are labeled. The Psophii of Cracraft are polyphyletic. Cranes and rails do
not form a fundamental dichotomy in Gruiformes, as is widely presumed (compare to Figs. 5.2 and 5.3).

[Figure 5.10 tree, bootstrap values in parentheses: Gallus; (Eurypyga, Rhynochetos) (98); (Aptornis, (Gallinula, Laterallus) (99)) (99).]

FIGURE 5.10 Parsimony bootstrap tree showing the relationship of Aptornis to rails, not kagu (1000
replicates heuristic parsimony with 7.3:1 transversion weighting, sites 200–920 of mitochondrial 12S
rDNA, domains I-III, for which Aptornis sequence is complete, using PAUP 3.1.1; Swofford, 1991;
compare to Fig. 5.2). Bootstrap values are labeled.

FIGURE 5.11 Parsimony bootstrap tree suggesting a relationship of bustards and seriemas (1000
replicates heuristic parsimony with 7.3:1 transversion weighting of mitochondrial 12S rDNA, domains
I-III, using PAUP 3.1.1; Swofford, 1991; compare to Fig. 5.2). Bootstrap values are labeled.

Bustard, seriema, roatelo, and hemipode tend to group in consistent positions,
but none of the positions are well supported and some weighting regimes produce
different results. Most consistent is a tendency for seriema and bustard to form a
clade that is sister to the cranes plus raillike birds. In a 9-taxon tree (Anthropoides, 
Grus, Laterallus, Gallinula, Ardeotis, Cariama, Eurypyga, Rhynochetos, Gallus; 7.3:1 
transversion parsimony; 1000 replicates) bootstrap support for the bustard—seriema 
clade is 88% and its sister relation to cranes—rails is 92% (Fig. 5.11). 

Seriema, bustard, roatelo, all Charadriiformes, heron, and ibis form a clade when 
all outgroups are included in transversion parsimony. Relaxation of weighting re- 
moves half of the Charadriiformes and both Ciconiiformes from this clade. None 
of these nodes receive strong bootstrap support. 

Hemipode is sister to all Gruiformes except sunbittern—kagu. In an 8-taxon tree 
(Anthropoides, Grus, Laterallus, Gallinula, Turnix, Eurypyga, Rhynochetos, Gallus; 7.3:1 
transversion parsimony; 1000 replicates), bootstrap support for a sister relationship 
of hemipode to cranes plus rails is 95% (Fig. 5.12). 

The shortest MP tree without transversion weighting includes roatelo as a member
of the raillike clade, sister to trumpeter. This, however, is only 2-10 steps
shorter than 4 dissimilar trees, grouping roatelo with seriema and/or bustard or
hemipode. Roatelo has a strong tendency to cluster with Charadriiformes (Larus,
Uria, Calidris) and ibis (Phimosus), and always groups with gull-murre when they
are included, regardless of weighting regime. The highest bootstrap value we
obtained in support of roatelo-Charadriiformes is 87% (all taxa; 8:1 transversion
parsimony, gaps weighted 4, inverse position weighting; 100 replicates).


B. Discussion 


Several groups are consistently supported by our analyses, in spite of what may first
appear to be conflicting results from different methods of reconstruction and low
bootstrap support. Many of the nodes with low bootstrap support in the complete
data set receive high support with fewer taxa. While the complete phylogeny
(Fig. 5.6) is not robust, it agrees with well-supported trimmed trees (Figs. 5.9-5.12).
Questions posed in Section I,A are addressed in order here.

FIGURE 5.12 Parsimony bootstrap tree showing the relationship of hemipodes among Gruiformes
(1000 replicates heuristic parsimony with 7.3:1 transversion weighting of mitochondrial 12S rDNA,
domains I-III, using PAUP 3.1.1; Swofford, 1991; compare to Fig. 5.3). Bootstrap values are labeled.


1. Are traditionally recognized Gruiformes monophyletic? Taken at face 
value, our data suggest that Gruiformes are not monophyletic. Kagu and sun- 
bittern always assume a position more basal than putative outgroups, except 
chicken. Bustards, seriemas, and roatelos form a clade with Charadriiformes, so 
Gruiformes may be paraphyletic. This accords with the description by Olson 
(1985) of the osteology of bustards as resembling glareolid Charadriiformes 
and the inability of Sibley et al. (1993) to consistently separate Gruiformes from 
Charadriiformes. We did not address the monophyly of Charadriiformes in 
our study. 

Hemipodes are not distant outgroups to Gruiformes as proposed by Sibley 
and Ahlquist (1990), but stone curlew (Burhinus) appears to be closely related 
to them in the absence of transformation weighting. 

Seriema never groups with secretary bird, as suggested by Verheyen (1957). 
Instead, they appear to be ecologically, behaviorally, and morphologically 
convergent. 

Neither sunbittern nor roatelos group with herons, as suggested by Olson
(1979, 1985). However, roatelos are the most problematic taxon studied. Their
position is unstable and poorly supported in our reconstructions. This is
unfortunate since morphology has thus far provided little guidance on their
relationships and we provide their first molecular data here. Preliminary observations
without formal character analysis reveal no compelling morphological
synapomorphies to unite roatelos with either Charadriiformes or Gruiformes (P. Houde,
personal observation). Their alliance with Charadriiformes among Gruiformes
in our gene phylogeny may be spurious.

Roatelos have probably survived a long period of isolation in Madagascar.
Their pairwise genetic distances to ingroups are larger than for any other pair
of taxa, even exceeding all other chicken-Gruiformes distances. Thus, roatelos
either represent an ancient lineage, unrelated to Gruiformes, or an unusually
rapid rate of evolution has erased any evidence of that relationship in their 12S
rDNA. If the latter, then this is quite the opposite of the evolutionary slow-down
of some other insular Malagasy endemics (Bonner et al., 1981).

2. Apart from roatelos, does the first branch in Gruiformes separate all rail- 
like birds from all cranelike birds, as in the subordinal classifications of Cracraft 
(1982) and Sibley and Ahlquist (1990)? No, all methods of reconstruction strongly 
support a clade that includes cranes and raillike birds to the exclusion of serie- 
mas, hemipodes, kagu, and sunbittern, and probably roatelos and bustards. The 
cranes and raillike birds appear to be a monophyletic group. 

3. Are the Psophii of Cracraft monophyletic? Clearly not, as trumpeters and
Aptornis are among the cranes-rails clade, seriemas appear to be sisters to
bustards, and kagu and sunbittern are far removed from all others. Reconstructions
showing monophyly of the Psophii of Cracraft are 60 unordered steps longer 
than optimal trees. While Aptornis is closely related to rails, the same is only weakly 
supported for trumpeters. Our result explains the difficulty people have had in 
distinguishing the position of trumpeters relative to rails and cranes. Trumpet- 
ers truly are intermediate between the other two. Olson's (1973) analogy of 
trumpeters to primitive rails agrees well with our result (paraphyly of trumpet- 
ers is not implied). 

4. Are finfoots monophyletic, and are they more closely related to the limp- 
kin or to rails? Finfoots are monophyletic and group among raillike birds. This 
is corroborated by our own DNA hybridization experiments and cladistic analy- 
sis of morphology (Houde, 1994; Houde et al., 1995). Although only weakly 
upheld by the data set at hand, this consensus of our findings should dispel any 
lingering acceptance of the limpkin-sungrebe clade proposed by Sibley and 
Ahlquist (1985, 1990). The fact that they are as close as they are is a surprise, in- 
deed, and underscores the previously unrecognized phyletic proximity of crane- 
like and raillike birds. 

5. Are sunbittern and kagu sister taxa? Yes, this relationship is supported 
strongly by all analyses. Kagu-like fossils from the early Eocene of Wyoming 
and middle Eocene of Germany (Hesse, 1988, 1992) place peculiar Amazonian 
and New Caledonian distributions of these monotypic families into perspec- 
tive. The distributions of fossils and neotaxa suggest a fairly cosmopolitan pan- 
tropical-temperate distribution of “eurypygoids” in the warm forests of the 
early Tertiary. The modern kagu and sunbittern appear to be relicts of this ra- 
diation surviving in the most isolated forest refugia. 

6. Is the fossil Aptornis more closely related to the kagu or to rails? There is
no question that Aptornis is much closer to rails than to kagu. The original de- 
scription by Parker (1866) of the skull of Aptornis in comparison to trumpeters 
was remarkably insightful (considering trumpeters to be like primitive rails, as 
by Olson, 1973). The phylogenetic informativeness of postcranial morphologi- 
cal characters is all but obliterated by gigantism and the shift of locomotory de- 
pendence from the wings to the legs in this flightless bird. 


Many of our results conflict markedly with the DNA hybridization studies of 
Sibley and Ahlquist (1985, 1990). The 12S rDNA sequences support a cranes—rails 
clade and a sunbittern—kagu clade, and a sister relationship of trumpeters and fin- 
foots to rails. Sibley and Ahlquist proposed that all other Gruiformes except hemi- 
podes are closer to cranes than rails are to cranes, that kagu is closer to everything 
except rails and hemipodes than it is to sunbittern, and that trumpeter is sister to 
cranes (Fig. 5.3). We also disagree with their placement of hemipodes far from 
Gruiformes. They presented a dendrogram that supports the sister relationship of 
bustards and seriemas that we found, but they did not speculate on its veracity (Sib- 
ley and Ahlquist, 1990: Fig. 335). 


IV. MOLECULAR EVOLUTION 
A. Sequence Divergence 


Pairwise Kimura distances (Table II) provide a rough guide to relative amounts of 
sequence divergence in the 12S rDNA of Gruiformes, although they cannot be 
considered uniformly proportional to divergence times. Temporal calibrations of 
sequence divergence apply neither across taxa nor across genes, and perhaps not 
through time itself (Ayala, 1986; Britten, 1986; Sheldon, 1987; Mindell et al., 1996). 
Saturation of sequence divergence by multiple hits on individual sites is a convinc- 
ing mechanism for compression of genetic distance relative to time (Mindell and 
Honeycutt, 1990). Opposite effects could hypothetically result from other as yet 
poorly characterized factors, including pervasive environmental mutagens, aspects 
of the natural history and population structure of a species, phyletic radiation, lateral
gene transfer by viruses, or genome transfer by hybridization. 
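
The distance measure named above can be sketched in a few lines. The following is a minimal illustration of the standard (unweighted) Kimura two-parameter formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions among compared sites. The function name and the choice to skip gaps and ambiguity codes are assumptions made for the example; it does not reproduce the 7.3:1 weighting or the site selection used for Table II.

```python
import math

# Ordered purine<->purine and pyrimidine<->pyrimidine changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Assumption for this sketch: sites with a gap or ambiguity code in
    either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    # When multiple hits saturate the sites, the logarithms are undefined.
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

The correction grows faster than the raw proportion of differences as P and Q increase, which is how it partly compensates for multiple hits; once the logarithm arguments reach zero the estimate is undefined, the limiting case of the saturation described above.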

All sites are not equally available to substitution, and the choices of which are
used and how they are weighted significantly alter divergence estimates (Pesole
et al., 1992). For this reason, we present the highest (variable sites only, 7.3:1
transversion weighting) and lowest (all sites, no weighting) pairwise distances (Table II),
rather than intermediate distances (i.e., all sites with weighting and variable sites
without weighting). Although the number of sites unavailable to vary may be
overestimated by considering only those that are observed to vary, distances based on all
sites definitely underestimate invariant sites. Moreover, the divergence of chicken
from Gruiformes is ancient (e.g., probably Cretaceous), so most sites available for
variation should show it.


TABLE II  Kimura Two-Parameter Distance Matrix^a

                      1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18
 1 Gallus             - 0.464 0.493 0.452 0.451 0.658 0.720 0.714 0.841 0.571 0.764 0.529 0.835 0.604 0.586 0.529 0.742 0.677
 2 Anthropoides   0.127     - 0.058 0.222 0.212 0.364 0.465 0.396 0.535 0.411 0.514 0.435 0.703 0.331 0.453 0.477 0.678 0.386
 3 Grus           0.136 0.023     - 0.233 0.204 0.360 0.435 0.356 0.464 0.377 0.505 0.401 0.730 0.346 0.458 0.484 0.739 0.421
 4 Balearica      0.123 0.075 0.080     - 0.231 0.441 0.440 0.442 0.507 0.394 0.514 0.362 0.780 0.444 0.438 0.527 0.936 0.537
 5 Aramus         0.124 0.073 0.071 0.077     - 0.396 0.416 0.352 0.465 0.357 0.420 0.359 0.655 0.330 0.433 0.496 0.788 0.428
 6 Psophia        0.159 0.112 0.113 0.127 0.119     - 0.528 0.502 0.678 0.566 0.675 0.566 0.783 0.553 0.782 0.615 0.893 0.635
 7 Rallus         0.166 0.131 0.126 0.124 0.120 0.145     - 0.355 0.328 0.468 0.650 0.418 0.825 0.591 0.634 0.596 0.969 0.545
 8 Gallinula      0.165 0.116 0.109 0.126 0.106 0.140     ?     - 0.294 0.364 0.594 0.351 0.914 0.518 0.566 0.564 1.025 0.682
 9 Laterallus     0.172 0.141 0.129 0.134 0.127 0.166     ? 0.093     - 0.472 0.624 0.327 0.933 0.595 0.639 0.691 1.221 0.881
10 Aptornis^b     0.157 0.131 0.121 0.125 0.116 0.162     ? 0.116 0.137     - 0.547 0.319 0.788 0.439 0.513 0.554 0.982 0.614
11 Heliornis      0.170 0.136 0.137 0.134 0.118 0.162     ? 0.146 0.146 0.153     - 0.287 1.114 0.549 0.649 0.781 1.168 0.819
12 Podica^b       0.127 0.118 0.111 0.100 0.103 0.140     ? 0.097 0.090 0.093 0.089     - 0.774 0.466 0.644 0.780 1.381 0.949
13 Mesitornis     0.181 0.171 0.180 0.179 0.163 0.180     ? 0.190 0.185 0.194 0.201 0.166     - 0.750 0.849 1.011 0.884 0.852
14 Ardeotis       0.153 0.103 0.108 0.127 0.102 0.149     ? 0.137 0.148 0.133 0.141 0.122 0.176     - 0.424 0.502 0.709 0.581
15 Cariama        0.149 0.130 0.133 0.125 0.124 0.184     ? 0.145 0.155 0.149 0.155 0.149 0.184 0.124     - 0.664 0.690 0.687
16 Eurypyga       0.138 0.131 0.136 0.140 0.136 0.157     ? 0.144 0.162 0.151 0.172 0.163 0.198 0.134 0.160     - 0.452 0.653
17 Rhynochetos^b  0.178 0.171 0.181 0.201 0.190 0.199     ? 0.200 0.228 0.225 0.217 0.208 0.201 0.181 0.177 0.140     - 0.907
18 Turnix         0.163 0.117 0.127 0.148 0.126 0.164     ? 0.167 0.189 0.175 0.178 0.193 0.193 0.152 0.169 0.161 0.217     -

a Above diagonal: 7.3:1 transversion weighting, variable sites only; below diagonal: unweighted, all sites included.
b Missing data.
? Not legible in the source. The surviving fragments of column 7 below the diagonal, in order, are .140, 0.153, 0.1, 14, 0.177, 0.149, 0., 0.150, 0.209, 0.144.

Statistically significant differences in evolutionary rates among Gruiformes are
identified by ranking unweighted two-parameter distances within monophyletic
groups, summed across all outgroups (“multiple comparisons for ranked data in
randomized block” of Zar, 1974; “SNK” of Houde, 1987). Although not
significant, among the cranes genetic distances involving Grus appear low while those
of crowned crane appear high, in accordance with the observations of Krajewski
(1989). Limpkin exhibits significantly lower distances than crowned crane (α =
0.05; 14 outgroups). Accordingly, in a 6-taxon tree rooted to chicken, trumpeter
undergoes 57 substitutions, limpkin only 32, crowned crane 55, and the two gruine
cranes have 53 and 54. The vastly different branch lengths within the
cranes-limpkin clade may complicate the recovery of the expected topology of limpkin
sister to cranes vs sister to crowned crane or outgroups (Nei, 1991; Huelsenbeck
and Hillis, 1993).

Rail and sungrebe distances appear high, although not so much as trumpeter. 
Among rails, Gallinula distances are significantly lower than Laterallus (α = 0.001)
and Rallus (α = 0.05; 15 outgroups). No other robust relative rate tests are justifiable
because of missing data in the only other clades for which independent evidence of 
monophyly is available. However, roatelo has the highest distance to chicken of all 
Gruiformes. 

Neighbor-joining places trumpeter with cranes but MP and ML include it with 
the rails, so we tested its distances using both phylogenies. When included in the 
cranes clade, trumpeter distances are observed to be significantly higher than all the 
others (α = 0.001; 13 outgroups). Trumpeter distances are not different from those of any
rails when included in that clade (α = 0.05; 10 outgroups).
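
The ranking step behind these relative rate comparisons can be illustrated as follows. This is only a sketch of the first stage (ranking ingroup taxa within each outgroup "block" and summing ranks across blocks, with ties given average ranks); it does not implement the multiple-comparison test of Zar (1974). The function name, taxon labels, and distance values are hypothetical and chosen purely for illustration.

```python
def rank_sums(distances_by_outgroup, taxa):
    """Sum within-block ranks of each ingroup taxon across outgroups.

    distances_by_outgroup maps an outgroup name to a dict of
    {ingroup taxon: distance}. Rank 1 is the smallest distance in a
    block; tied values share the average of the ranks they occupy.
    """
    sums = {t: 0.0 for t in taxa}
    for row in distances_by_outgroup.values():
        for t in taxa:
            less = sum(1 for u in taxa if row[u] < row[t])
            equal = sum(1 for u in taxa if row[u] == row[t])  # includes t itself
            sums[t] += less + (equal + 1) / 2.0
    return sums


# Hypothetical distances, not values from Table II.
toy = {
    "outgroup_1": {"limpkin": 0.10, "crowned_crane": 0.14, "grus": 0.12},
    "outgroup_2": {"limpkin": 0.11, "crowned_crane": 0.15, "grus": 0.11},
    "outgroup_3": {"limpkin": 0.09, "crowned_crane": 0.13, "grus": 0.12},
}
toy_sums = rank_sums(toy, ["limpkin", "crowned_crane", "grus"])
```

A taxon whose rank sum is consistently low across outgroups has accumulated less change since the ingroup's common ancestor than its relatives. Whether a difference in rank sums is significant depends on the number of taxa and blocks, which is the part of the procedure not shown here.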


B. Evolutionary Dynamics of 12S rDNA 


One cannot help but be struck by the conservation of both sequence and secondary 
structure between rDNAs of such disparate groups as bacteria and vertebrates (Van 
de Peer et al., 1994). Yet, on this broad evolutionary scale one also appreciates the 
elongation, shortening, and complete loss of some stems. The small-scale events that 
lead to such large-scale patterns require comparison of the genes in both relatively 
closely and distantly related organisms (Kjer, 1995; Hickson et al., 1996). Here, we 
describe some of the small-scale variation in the 12S rDNA of Gruiformes. The
variation we note has negative implications for the general practice of matching 
sequences to structural maps of rDNA constructed from unrelated organisms. Our 
stem nomenclature follows Van de Peer et al. (1994). 


1. Stem Migration 


We noted movements of complementary bases upstream and downstream within
stems. This "stem migration" seems to result more from substitutions that affect
complementarity than from insertions and deletions. It may result in extended or
reduced base complementarity, but usually involves the migration of one or both
sides of the stem region. These discrepancies between taxa are reflected in the
irregularly boxed stems 8 and 26 of Fig. 5.4.

Stem 8 is the most variable (Figs. 5.4 and 5.13). Its distal segment is flanked on
both sides by single-stranded loops of variable length. Similarities in sequence and 
stem position are apparent among closely related taxa, but identification of homolo- 
gous sites is difficult across all Gruiformes. The stem consists of five pairs of com- 
plementary bases in all taxa except the hemipode (one mismatch) and trumpeter 
(which seems to have six). The upstream side of the stem in rails has a pattern of 
YYCCT, seriema and bustard have CCTTA, limpkin and crowned crane have 
CCTAR, and gruine cranes have CCTAT. The pattern in trumpeter (GCCTAC) is 
the same as in gruines (i.e., CCTAY), except an additional purine is added to the 5’ 
end. A 5’ purine is otherwise a uniquely shared character of the stems of sunbittern 
and kagu (RCCTT). 
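
The stem patterns above use IUPAC ambiguity codes (R for purine, Y for pyrimidine). A small sketch can make the comparisons explicit by checking a concrete sequence against a pattern. The helper name `matches_pattern` is an assumption for this example; the code table itself is the standard IUPAC nucleotide alphabet.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_pattern(pattern, sequence):
    """True if every base of sequence is allowed by the IUPAC pattern."""
    if len(pattern) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern, sequence))


trumpeter = "GCCTAC"
# Dropping the extra 5' base leaves a sequence that fits the gruine pattern.
trumpeter_fits_gruine = matches_pattern("CCTAY", trumpeter[1:])
# The extra 5' base is a purine, as in the sunbittern-kagu pattern RCCTT.
trumpeter_has_5prime_purine = trumpeter[0] in IUPAC["R"]
```

Under these definitions the gruine sequence CCTAT fits CCTAY but not the limpkin and crowned crane pattern CCTAR, which is the distinction the text draws between the two groups.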

Alignment of the upstream CCT of stem 8 in all taxa invokes considerable stem 
migration. Phylogenetic weighting produces an alternate alignment that is both more
parsimonious and invokes less stem migration. On the downstream side of stem 8 
most taxa have a homologous sequence of AGG (i.e., aligned), complementing the 
upstream CCT (minor differences in hemipodes and roatelos). The stem of sunbit- 
tern and seriemas (kagu data missing) migrates 1 base downstream, while that of 
rails migrates 2 bases downstream relative to the AGG sequence and the stem of 
other Gruiformes. 

In roatelo, stem 24 migrates upstream 1 base on the downstream side of the stem 
by transition substitutions. In crowned crane, roatelo, and sunbittern, stem 26 ap- 
pears to migrate upstream 1 base on the upstream side of the stem, and upstream 2 
bases on the downstream side of the stem. It is difficult to infer the location of 
stem 26, however, because it consists of only two pairs of bases. This is reflected 
in the ambiguity of Fig. 5.4, in which the stem is shown to overlap stem 24. Hy- 
pothetically, such overlap could accurately reflect temporally alternate or tertiary 
structural interactions as in stem 22, but there is no independent evidence for such 
phenomena here. 

All the birds examined here differ from mammals in stems 24, 27, and 47. The 
entirety of the avian stems 24 and 27 migrate one position proximal compared 
with mammals (i.e., upstream on upstream side, downstream on downstream side).
Stem 47 is inferred to have elongated distally relative to mammals (i.e., downstream 
on upstream side, upstream on downstream side). 


2. Compensatory Substitutions 


We made anecdotal notes on frequency of compensatory substitutions while
counting putative synapomorphies for clades of interest. Unlike insertions and deletions
within stems (Section IV,B,3), most substitutions within stems precipitate
compensatory substitutions on the complementary side. Of those that we tracked, no fewer
than 64% of compensations that occur do so within the period of one internode in
the phylogeny reconstruction. Some that are located in the inferred distal
extension of stem 47 remain uncomplemented over several internode periods, but such
delays are not unique to this stem. Noncanonical pairing or nonpairing may be
favored in these instances, or they may reflect our inability to detect all substitutions
in the depths of the tree.

[Figure 5.13: alignment panels for positions 77-120 (stem 8), 721-750 (stems 39 and 42), and 824-840 (stem 47)]

FIGURE 5.13 Partial sequence alignment of Gruiformes mitochondrial 12S rDNA. The consensus sequence
is for Gruiformes only, but gaps in consensus sequence are shown to permit alignment with outgroups.
Stems are underlined and labeled. Upper-case lettering, canonical pair in stem; upper-case italics,
noncanonical pair in stem; lower-case, nonpairing base. Positions 77-120 illustrate migration
of stem 8 and an uncompensated insertion within the stem at position 119 in Psophia. Positions 721-750
illustrate replication slippage in the distal loop of stem 42 as a mechanism for sequence length
variation. Positions 824-840 illustrate shortening of stem 47 in Rhynochetos by an uncompensated
deletion at position 830.

We may have observed an early event in the process of compensatory substitu- 
tion. This involves an apparent example of heteroplasmy in Rallus. Both C and T 
bands occur at site 372 in an otherwise clear and unambiguous sequence autoradio- 
graph. This site is in a complementary position to site 402 in stem 24, where Rallus 
is unique in possessing a transversion substitution from A to T. The C represents 
a transition substitution from the T state at site 372. Neither C nor T is
complementary to the base at site 402. We speculate that the heteroplasmy represents
temporary relaxation of selection for sequence conservation related to the process of compensation.
Stated differently, all noncomplementary bases may be tolerated equally, transitions 
would likely be the first form of variants to appear, and complementary bases would 
be selectively favored when they arise, eventually stabilizing the sequence. 
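
The pairing logic in the Rallus example can be checked with a short sketch. It assumes canonical Watson-Crick pairing written in DNA notation, with an optional G-T (G-U in the RNA) wobble pair that is an assumption of this example rather than something the text specifies; the function name is likewise hypothetical.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "T"), ("T", "G")}  # G-U in the RNA; optional in this sketch


def can_pair(base1, base2, allow_wobble=False):
    """True if two bases on opposite sides of a stem can pair."""
    pair = (base1.upper(), base2.upper())
    return pair in WATSON_CRICK or (allow_wobble and pair in WOBBLE)


# Site 402 in Rallus carries T after a transversion from A.
site_402 = "T"
# Both variants observed at site 372 fail to pair with it, even with wobble.
variants_pair = [can_pair(v, site_402, allow_wobble=True) for v in ("C", "T")]
# The inferred earlier state (T at 372, A at 402) was a canonical pair.
ancestral_pairs = can_pair("T", "A")
# An A at site 372 would restore canonical pairing with the new T at 402.
compensated = can_pair("A", site_402)
```

On this reading, restoring the pair would require a transversion at site 372 (T to A), whereas the observed C variant is a transition, in keeping with the suggestion that transitions appear first while complementarity is not yet restored.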


3. Insertions and Deletions 


We noted uncompensated insertions and deletions in stems that create or eliminate 
bulges (Figs. 5.4 and 5.13). Crowned crane and limpkin share a deletion of a 1-base
bulge on the downstream side of stem 3 (position 315). Trumpeter possesses a 1-base
uncomplemented addition in stem 8 (position 119). Kagu exhibits a unique 1-base 
uncompensated deletion on the first base of the upstream side of stem 47 (posi- 
tion 830). All three rails exhibit a synapomorphic 1-base uncompensated deletion 
on the downstream side of stem 48 (position 902). 

Replication slippage may accelerate length variation in polynucleotides, and di- 
and trinucleotide repeats in a variety of systems (Tautz et al., 1986; Hancock and 
Dover, 1990; Weston-Hafer and Berg, 1991; Degoul et al., 1991; Wolfson et al., 
1991). The terminal loop of stem 42 varies from 5 to 12 bases in Gruiformes 
(Figs. 5.4 and 5.13). Cranes synapomorphically possess the most bases here. The 
loop begins with a variable number of pyrimidines (up to eight; almost all Cs) that 
immediately follow a C in the stem. The stem of hemipode ends with a T, and its 
loop begins with poly(T) instead of poly(C). The next segment of the loop includes 
up to three purines (mostly As), followed by a variable number of bases of which 
most are pyrimidines, especially Cs. This appears to be an example of length varia- 
tion by slippage. Slippage seems to occur more readily than transitions, which in 
turn exceed transversions. Accordingly, crowned crane is the only member of the 
cranelike birds to exhibit a substitution among the 3- to 9-base poly(C) of the first 
segment, and it is a transition. Replication slippage may also be involved in the 
length variation at positions 83-107, 140-145, 659-660, and 817-820.

After a long history of substitutions has overwritten the earmarks of replication
slippage, it may be impossible to distinguish chance similarities from homology. In 
variable-length regions, it may be prudent not to attempt to force sequences into 
alignment in the interest of phylogenetic inference. 


4. Among-Site Rate Heterogeneity 


Site variability, or substitution rate per site, differs profoundly between higher taxo- 
nomic groups. This is somewhat surprising in light of the sequence conservation 
between bacteria and vertebrates at some positions. Sullivan et al. (1995) identified 
29 most-variable sites in their study of 12S rDNA in sigmodontine rodents, a group 
that is considerably less diverged among themselves than are Gruiformes. While the 
maximum number of changes per site they observed was three, we infer sites to 
have changed as many as eight times in our phylogeny of Gruiformes. Of 21 most- 
variable sigmodontine sites of which we could determine the homology in Grui- 
formes, 8 are invariant in Gruiformes, 3 changed only once, 3 twice, 3 three times, 
3 four times, and 1 changed six times. This is clearly a different distribution of 
among-site rate variability between the two groups. Thus, the evolutionary
dynamics of 12S rDNA may differ substantially between birds and mammals, and possibly
contribute to the higher levels of resolution in some nonavian 12S rDNA
reconstructions of even deeper divergences than those in Gruiformes (e.g., Cummings
et al., 1995).

Phylogenetic informativeness is often thought of as inversely related to the rate 
at which changes occur, and weighting schemes are designed accordingly. But they 
need not be. The lengths of the monotonous polynucleotides in loop 42 may have 
evolved rapidly but they are obviously highly correlated with phylogeny (Sec- 
tion IV,B,3). For example, in the first segment rails have one C residue, finfoots 
have two, trumpeters and limpkin have three, and cranes have five to eight. Further 
similarities are exhibited by sunbittern—kagu—hemipode and seriema—bustard and 
finfoot—Aptornis—roatelo. 


C. Character Evolution of 12S rDNA


Many regard nucleotide substitutions as stochastic or effectively neutral within the 
recognized constraints of positional variation in evolutionary rates and differences 
in rates between evolutionary lineages. By limiting consideration of frequency of 
nucleotide substitution merely to positional effects of evolutionary rate, one may 
overlook the ways in which nucleotide substitutions may be functionally correlated
or adaptively specialized (Margoliash et al., 1976; Hancock and Dover, 1990; Irwin 
et al., 1991; Gillespie, 1991; Ma et al., 1993). 

Positions of synapomorphously and homoplasiously shared derived characters of
individual clades may be identified on a structural map of the gene, and tested for
fit to models of expected distributions of nucleotide substitutions. Derived
characters that are concentrated in functional or structural regions of a gene and that differ
significantly from expected distributions of nucleotide substitutions might represent
adaptive specializations.


1. Expected Probability of Substitution 


The expected probability of substitution for any site is estimated by dividing the
number of times that site changed by the sum of all nucleotide substitutions at all
sites in a most-parsimonious phylogeny of many taxa. The expected substitution
frequency for a functional or structural region is simply the sum of the expected
substitution probabilities of all sites included in that region, which is otherwise defined by
independent criteria (e.g., protein-binding domain or secondary structure).
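
The two definitions above amount to a normalization followed by a sum. A minimal sketch, assuming the per-site change counts have already been read off a most-parsimonious tree, is given below; the function names and the toy counts are hypothetical and are not data from this study.

```python
def expected_site_probabilities(changes_per_site):
    """Per-site expected probability: changes at a site / all changes."""
    total = sum(changes_per_site)
    if total == 0:
        raise ValueError("no substitutions in the phylogeny")
    return [c / total for c in changes_per_site]


def expected_region_frequency(changes_per_site, region_sites):
    """Expected frequency for a region: sum of its sites' probabilities."""
    probs = expected_site_probabilities(changes_per_site)
    return sum(probs[i] for i in region_sites)


# Toy example: 8 sites with 10 inferred changes in total (hypothetical).
toy_changes = [0, 2, 1, 0, 5, 0, 1, 1]
toy_region = range(1, 5)  # sites 1-4, zero-based, defined independently
toy_frequency = expected_region_frequency(toy_changes, toy_region)
```

In this toy case the region holds half of the sites but 80% of the inferred changes, so a clade whose derived characters were spread evenly over sites would show fewer changes in the region than expected.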

We partitioned the 12S rDNA molecule into large regions on the basis of protein
binding, and small regions corresponding to stem/loop structure. The large regions
include sites 339-540 in domain II, which broadly encompass the binding domain
of ribosomal proteins S6 + S18, and sites 610-670 and 770-930 in domain III, which
collectively include the binding domains of proteins S7 and S19 (Noller et al., 1990;
Ehresmann et al., 1990). The small regions include positions 339-452 and 796-861
within the larger sets.

We summed all nucleotide substitutions (from 0 to 8 per site; Section II,A,5;
Table I) in each of these regions in the most parsimonious phylogeny and divided those
sums by the total number of nucleotide substitutions in the phylogeny (n = 1110) to
calculate the percent of substitutions detected in each region, our estimate of
expected substitution probabilities for each region. Across all Gruiformes, only 21.5%
of all sites but 34.3% of all substitutions occur in large region domain II, 27.8% of
sites but 26.5% of substitutions in large region domain III, 12.1% of sites but 23.0%
of substitutions in small region domain II, and 7% of sites but 10.0% of substitutions
in small region domain III. The expected probability of substitution is greater in
domain II than in domain III.


2. Distribution of Synapomorphies 


Casual observation suggested to us that substitution frequencies were different in 
some clades than the whole of Gruiformes. We used chi-square (Zar, 1974) to test 
whether the relative frequencies of substitutions (i.e., synapomorphies) on indi- 
vidual branches differed significantly from the whole phylogeny in each of these 
regions (Section IV,C,1; substitutions defining the clade were subtracted from the 
whole to ensure independence of sets being compared). We performed this test on 
both the branches uniting trumpeter with raillike birds and uniting trumpeter with 
cranelike birds, since our data support the former sister relationship only weakly.
Parsimony bootstrap values support inclusion of trumpeters in the cranes-rails group,
but do not significantly favor a grouping with either cranes or rails (Fig. 5.9).

FIGURE 5.14 Putative synapomorphies (large dots) in mitochondrial 12S rDNA, domains I-III, of
trumpeter (Psophia) with mutually exclusive hypothetical sister taxa cranelike birds (left) and raillike birds
(right). (A) Cranes and limpkin; (B) rails, finfoots, and Aptornis; (C) trumpeter. Protein-binding domains
of S6 + S18, and S7 and S19, are broadly circumscribed by boxes. The distribution of substitutions is
significantly different from expected frequencies in boxed regions and between the two hypotheses of
phylogeny. See text for details.


Raillike birds (“rails” below) consisted of a star phylogeny for Aptornis, Gallinula,
Heliornis, Laterallus, Podica, and Rallus. Cranelike birds (“cranes” below) consisted of a
star phylogeny for Anthropoides, Balearica, and Aramus, and these were rooted by a star
clade of Ardeotis, Cariama, Mesitornis, Turnix, sisters sunbittern-kagu, and finally
chicken. Twenty-five putative synapomorphies unite trumpeter with “cranes” and
18 are shared by trumpeter and “rails” (Fig. 5.14). We did not consider a phylogeny
with trumpeters as sister to both “cranes” and “rails” because this parsimony
reconstruction was 2.2% longer than the others.

Significant differences from expected frequencies of substitutions were apparent
in the large region of domain II in the branch uniting trumpeters-“rails,” and
in all of the small regions for both trumpeters-“rails” and trumpeters-“cranes”
(p < 0.025). Eliminating compensatory substitutions from consideration does not
change the results. Thus the evolutionary rate in some functional and structural
regions differs significantly between some gruiform clades, as both phylogenies gave
significant results.
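
The form of this test can be sketched for a single region as a two-category goodness-of-fit comparison (inside the region versus elsewhere). This is a minimal sketch, not the analysis actually run: the branch count of 11 is hypothetical, the expected share is the whole-phylogeny value without subtracting the branch's own substitutions, and all names are our own.

```python
import math


def chi_square_two_category(in_region, total, expected_share):
    """Goodness-of-fit test for one region versus the rest of the gene (1 df)."""
    observed = [in_region, total - in_region]
    expected = [total * expected_share, total * (1.0 - expected_share)]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom the chi-square tail probability has a closed form.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical branch: 18 synapomorphies, 11 of them in large region, domain II,
# where 34.3% of all substitutions on the whole phylogeny occur.
stat, p_value = chi_square_two_category(in_region=11, total=18, expected_share=0.343)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

With counts this small the expected values approach the usual lower limit for chi-square tests, so the result should be read cautiously.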


3. Distribution of Synapomorphy versus Homoplasy 


Patterns of molecular synapomorphy and homoplasy can be elucidated by comparing
mutually exclusive sets of putative synapomorphies indicated by alternate
hypotheses of phylogeny for a single taxon (C) relative to its candidate sisters (A and
B; Fig. 5.14). The putative synapomorphies of C to each of clades A and B are
mutually exclusive as long as A and B are sisters to one another (exclusive of C).

We tested for differences in frequencies of putative synapomorphies in functional
and structural regions between the mutually exclusive hypothetical clades,
trumpeters-“rails” and trumpeters-“cranes” (Fig. 5.14). This asks whether the
trumpeters-“cranes” and trumpeters-“rails” tree topologies affect the positions of
putative synapomorphies differently. Using a chi-square contingency table with Yates
correction (Yates, 1934; Zar, 1974), we found that the two phylogenies differ
significantly from one another in the frequencies of synapomorphies in
all regions examined except large region, domain II (large region, domain III,
p = 0.001; small region, domain II, p = 0.025; small region, domain III, p = 0.05).
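
A Yates-corrected test on a 2 x 2 table can be sketched as follows. The row totals (25 and 18 putative synapomorphies) are taken from the text, but the split of each total into "inside" and "outside" a region is hypothetical, as are the function and variable names.

```python
import math


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]] (1 df)."""
    n = a + b + c + d
    # The continuity correction shrinks |ad - bc| by n/2, never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    stat = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Rows: the two hypotheses; columns: synapomorphies inside / outside one region.
# Row totals (25 and 18) come from the text; the split by region is hypothetical.
cranes_in, cranes_out = 12, 13
rails_in, rails_out = 2, 16
stat, p_value = yates_chi_square(cranes_in, cranes_out, rails_in, rails_out)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

The correction makes the test more conservative, which is the usual reason for applying it to tables with small cell counts such as these.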

What value could there be in making a comparison with at least one tree that 
must be erroneous? Minimally, we have shown that synapomorphies and homopla- 
sies are distributed differently; but we believe we have done more. Because (1) sets 
of synapomorphies defining two mutually exclusive hypothetical clades must in re- 
ality include at least one set of homoplasies, (2) both sets represent derived states 
(i.e., not plesiomorphies, as determined from outgroups), (3) distributions of those 
sets differ significantly between alternative phylogenies, and (4) those differences 
correspond to functional and structural regions of the gene, both genic divergence 
and convergence are observed to localize in functional and structural regions. This 
demonstrates variation in among-site evolutionary rates between sister taxa that 
could possibly represent (different) adaptive specializations at the molecular level. 

Convergence may be implicated when homoplasiously shared characters coincide
with discrete structural or functional parts of an organism or a gene, as they are
in this example. We emphasize a distinction between suites of convergent characters
and homoplasious noise. The former may be under the influence of a unifying se- 
lective agent, possibly affecting secondary structure and molecular interactions. We 
cannot distinguish whether our observation results from selection or is merely a 
byproduct of other factors. 

There may be explanations for the different groupings of shared characters in the
trumpeters-“rails” and trumpeters-“cranes” trees that do not invoke adaptive
specialization. Because the rate of evolution in domain II exceeds that of domain III
overall (Section IV,C,1), the higher than expected number of shared characters
in domain III of trumpeters-“rails” might represent phylogenetic signal, while the
higher than expected number of shared characters in domain II of
trumpeters-“cranes” might be an attraction of long branches (Felsenstein, 1978). This
explanation may not account for the less than expected numbers of shared characters in
domain II of trumpeters-“rails” and domain III of trumpeters-“cranes.”

We used Mann-Whitney U (Zar, 1974) to test whether synapomorphies of
trumpeters-“cranes” (domain II) differ from synapomorphies of
trumpeters-“rails” (domain III) in evolutionary rate (i.e., number of substitutions per site;
Table I; both large and small regions were tested). This differs from the observation
that domain II has a higher expected substitution probability overall than domain III
because it examines the evolutionary rate on a site-by-site basis. We also tested
whether all synapomorphies of trumpeters-“cranes” differed from all
synapomorphies of trumpeters-“rails” in evolutionary rate. No test detected significant
differences in substitution rates between sites defining the trumpeters-“cranes” and
trumpeters-“rails” clades (p > 0.2). This suggests that the difference in
distributions of synapomorphies and homoplasies we observed cannot be attributed
simply to rate differences among sites.
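
The site-by-site comparison can be illustrated with a small rank-sum calculation. This is a sketch under stated assumptions: the per-site substitution counts are invented for illustration, and the helper names are our own. The smaller of the two U values would then be compared with tabulated critical values (e.g., in Zar, 1974).

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical substitutions per site at synapomorphic positions.
cranes_sites = [3, 5, 2, 4, 6, 3]
rails_sites = [2, 4, 3, 5, 1]
u_cranes, u_rails = mann_whitney_u(cranes_sites, rails_sites)
print(u_cranes, u_rails)
```

Because the test uses ranks only, it makes no assumption about the shape of the distribution of per-site rates.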


V. IMPLICATIONS OF 12S EVOLUTION FOR
PHYLOGENETIC INFERENCE 


The lack of a clear resolution of many aspects of gruiform phylogeny from 12S rDNA
is disappointing. 12S rDNA strongly supports relationships of some recently di- 
verged taxa, but not of more distant taxa. Even some traditionally accepted family 
groups do not receive robust support from our data. What is the cause for the lack 
of phylogenetic resolution in these data? 

12S rDNA is not entirely saturated by homoplasious substitutions at the levels of
gruiform divergence because it performs well at resolving much older divergences.
Yet it exhibits sufficient noise to hinder resolution of a phylogeny that may be
characterized by relatively short internodal branches (Fig. 5.8). Low jackknife values are
symptomatic of such noise (Fig. 5.6). Our phylogenetic hypotheses must be
interpreted cautiously. A gene phylogeny may not always accurately reflect organismal
phylogeny (Avise et al., 1984; Wu, 1991; but see Moore, 1995). Discrepancies
in phylogenetic reconstructions derived from different genes demonstrate that not
all of them can accurately reflect organismal phylogeny (Felsenstein, 1988; Bremer, 1988;
Pamilo and Nei, 1988; Wheeler and Honeycutt, 1988; Hendy and Penny, 1989;
Doyle, 1992; Sanderson and Doyle, 1992).

We chose 12S rDNA partly because it includes both evolutionarily labile and conserved
regions, and therefore should have a broad window of resolution for addressing
recent and ancient divergences. However, among-site variation in evolutionary
rate, among other factors, proves to impede rather than enhance the successful
recovery of gene phylogenies. Phylogeny reconstruction is sensitive to substitution
bias (Brown et al., 1982; Knight and Mindell, 1993), differences in evolutionary rates
at the level of the organism (Britten, 1986; Sheldon, 1987), gene (Ayala, 1986), or
nucleotide position (Milkman and Crawford, 1983), homoplasious evolution
(Wilkinson, 1991), composition bias (Collins et al., 1994), and differences in branch
lengths, tree symmetry, and number of taxa (Nei, 1991; Huelsenbeck and Hillis,
1993). The factors that influence the reliability of tree-building methods are well
understood only for simple conditions (Hendy and Penny, 1989; Rohlf et al., 1990;
Nei, 1991; Navidi and Beckett-Lemus, 1992; Huelsenbeck and Hillis, 1993; Kim
et al., 1993; Zharkikh and Li, 1993).

Whatever lack of resolution is symptomatic of 12S rDNA data, it probably does
not accrue from positional rate variation as traditionally perceived because neither
position weighting nor data partitioning improves tree resolution. Instead, it may
result from differences in evolutionary rates between taxa and differences in posi- 
tional rates between taxa. The conserved sequences and secondary structures of 
small subunit rDNA shared by prokaryotes and vertebrates belie dramatic differ- 
ences in evolutionary lability of homologous regions between different lineages. 
This is evident at a large scale by the observation that a quarter to a third of the most 
variable sites in rodents are invariant in Gruiformes. It is evident at the small scale 
by significant differences in regional substitution rates between sister clades. High 
rates of substitution, thus, are not confined to particular sites across taxa; they are 
found in different locations in different lineages. 


VI. SUMMARY 


We performed phylogenetic reconstructions using 12S rDNA sequences from rep- 
resentatives of all the currently recognized families of Gruiformes. We found rails 
closer to cranes than many other Gruiformes widely believed to be close to cranes. 
We suggest that trumpeters are closer to rails than to cranes, but suggest that they 
are intermediate between the two. Among a clade of rail relatives are the sungrebe 
and finfoots and the fossil Aptornis. Kagu and sunbittern are each the only close 
relative of the other, and are the most distant of all Gruiformes examined. We make
several observations and inferences on the evolutionary dynamics and character
evolution of the 12S rDNA molecule, including (1) variation in secondary structure
resulting from stem migration and uncompensated insertions and deletions within
stems, (2) replication slippage as a mechanism of sequence length variation in loops,
(3) differences in per-site substitution rates between birds and mammals, (4) the
process of compensatory substitution in stems, and (5) differences in distributions
of synapomorphies and homoplasies that are spatially correlated with functional and
structural regions of 12S rDNA. A robust but simplified estimate of the instantaneous
ratio of transversion to transition rates is calculated for the 12S rDNA
of Gruiformes.


I. INTRODUCTION


There is only one evolutionary history. A quick glance, however, through the cur- 
rent systematics literature will indicate that some troublesome taxa are notable for 
the degree of divergent and incongruent phylogenies associated with them. Incon- 
gruency in phylogenetic reconstruction is troublesome, because it is the outward 
manifestation of intrinsic contradictions in the methodologies or philosophies em- 
ployed. Perhaps the most troublesome taxon in birds at present is the Pelecani- 
formes, a traditional order of pelican-like birds commonly hailed as an exemplar of 
a natural group. 

Pelecaniformes and the other large waterbird orders (i.e., Sphenisciformes,
Gaviiformes, Podicipediformes, Procellariiformes, Ciconiiformes) are considered by
most systematists to comprise the early branching clades of nonratite birds. Previous
studies of the biology and higher level relationships among these orders have helped
influence our concepts about macroevolutionary patterns among birds. Exceedingly
detailed anatomical and morphological studies from the beginnings of avian
systematics research (e.g., Garrod, 1873; Mivart, 1878; Fürbringer, 1888; Gadow, 1889;
Beddard, 1899) established the relationships and methods of character analysis and
interpretation that prevail today.

Assessment of the higher order relationships among nonpasserines is one
of the most critical issues in avian systematics (Sheldon and Bledsoe, 1993), and a
crucial problem at present is whether the Pelecaniformes is monophyletic. This is a
relatively small group (about 70 species in 6 families) of pelican-like waterbirds
having a cosmopolitan distribution (they are absent only from deserts and polar
regions), and one that has generated controversy since the beginnings of avian
systematic research. Molecular studies (Sibley and Ahlquist, 1990; Hedges and Sibley,
1994; Fig. 6.1) suggest that Pelecaniformes is not monophyletic and that the former
members are instead grouped with other early branching groups. If these
molecular-based studies are validated and Pelecaniformes is dismantled, then the traditional
morphological framework that forms the basis for most phylogenetic assumptions
in avian systematics is clearly in need of reexamination as it pertains to modern
systematics research. Most avian phylogenies at present are based on morphology or
other traditional characters, but while the hypotheses may be explicit, their
empirical and philosophical foundations are not.

Phylogenetic hypotheses are the critical framework for understanding macro- 
evolutionary patterns and interpreting comparative evolutionary data. Quantitative 
comparative analysis of the patterns in molecular, behavioral, and morphological 
evolution requires detailed phylogenies, particularly when character states and 
transformation series are incompletely known. For these reasons, a better under- 
standing of higher order phylogenies is crucial when group monophyly is problem- 
atic and major evolutionary change has occurred. 


A. The Problem of the Putatively 
Privative Pelecaniformes 


Understanding the phylogenetic relationships of the pelecaniform birds, therefore,
is central to the larger question of understanding the higher level relationships
among nonpasserine birds. Traditionally, this order comprises six families:
Phaethontidae (tropicbirds), Fregatidae (frigatebirds), Pelecanidae (pelicans),
Sulidae (gannets), Phalacrocoracidae (cormorants), and Anhingidae (darters).
Pelecaniform affinities have been established using conventionally distinctive features
including totipalmate feet or steganopody (four toes joined by a web), gular pouch
under the mandible, a prelanding vocalization, and several others. Procellariiform
birds are considered to be the sister-group to the traditional Pelecaniformes, but
there is no agreement among studies on this or on most other aspects of higher
order relationships among the early branching clades of waterbirds (Cracraft, 1985;
Olson, 1979; Sibley et al., 1988; Sibley and Monroe, 1990; Hedges and Sibley,
1994).


FIGURE 6.1 Distance tree for pelecaniforms (taxa marked by asterisk) and relatives inferred from
DNA sequences of mitochondrial 12S and 16S rRNA genes (1.7 kb total). The tree was constructed by
neighbor joining (Saitou and Nei, 1987); bootstrap values were determined by 2000 replications.
(Reproduced with permission from Hedges and Sibley, 1994.)

Pelecaniformes, in turn, have been considered by many authorities to be the
sister-group to herons and storks (Ciconiiformes sensu lato), and since the earliest
studies systematists have utilized pattern analyses of pelecaniform morphology and
behavior as insights into the historical and developmental trajectories of avian
evolution. For example, most higher order avian taxa are defined by traditional
characters such as an acarinal sternum in ratites, tubular nostrils in procellariiforms, and
holorhinal bills in charadriiforms. If these and other traditional characters used for
grouping prove to be convergent, homoplasious, or otherwise not useful, then
current notions concerning use and stability of traditional characters in systematic
research are problematic. Furthermore, the issue of pelecaniform monophyly is
pivotal to the larger issue of higher level relationships among basal waterbirds. If
hypotheses concerning the paraphyly of pelecaniform birds are correct, then
traditional ideas about early patterns of morphological and behavioral evolution in birds
need to be completely reassessed. This will have great impact on current studies
underway in later branching clades of birds. Equally important, rearrangement of
the basal orders will impact future systematic studies of other avian taxa by changing
the status of putative sister-groups, the patterns of character state evolution, and
the polarity of transformation series.

Group membership in the Pelecaniformes has remained contentious from the 
beginning. Many of the dissenting early authorities believed that tropicbirds and/ 
or frigatebirds were inappropriately conjoined with the others; however, most 
agreed that the Sulidae, Phalacrocoracidae, and Anhingidae (the "core Pelecani- 
formes") were closely related. Unfortunately, in the more than 30 systematic studies 
done since Huxley's (1867) treatment of the order, only three have employed repro- 
ducible methodology—and these results are together ambiguous. Cracraft (1981, 
1985) pioneered numerical cladistics for birds and undertook one of the first quan- 
titative analyses in his study on the monophyly of the Pelecaniformes. Cracraft 
employed both morphology and behavior and used cladistic methods as a means 
to build trees and to test for group monophyly. Cracraft concluded that the order 
was monophyletic, that the traditional relationship among families (e.g., Wetmore, 
1960) was correct, that tropicbirds and frigatebirds were members of the order, and 
that Balaeniceps rex was not (see below). Soon after, Sibley and colleagues (Sibley 
et al., 1988; Sibley and Ahlquist, 1990), employing DNA-DNA hybridization tech- 
niques and UPGMA clustering methods, presented novel results for pelecaniform 
relationships. They concluded that Pelecaniformes were paraphyletic. The original 
six families comprising the order were now distributed among three large taxon 
groups and some with startling sister-group relationships (e.g., pelicans and herons; 
frigatebirds and a clade comprising penguins, loons, and albatrosses). Most recently, 
Hedges and Sibley (1994) recovered the 12S and 16S mitochondrial DNA nucleo- 
tide sequences of representative pelecaniform taxa and outgroups, and generally 
affirmed the results obtained earlier by Sibley and colleagues (Fig. 6.1). 

This sharp divergence between morphological and molecular-based research is
troubling, because we expect that phenotypic and genotypic characters generally
should correlate given the same evolutionary history. While most morphological
and molecular studies on the same groups do differ in some details, few deviate as
much as those done on Pelecaniformes (Bledsoe and Raikow, 1990). One possible
cause for the disparity between the morphological and molecular results
may be that Pelecaniformes as traditionally constituted is a privative group.
Privative groups are those founded on the absence of traits or on traits so ill-defined
that no inclusive group exclusively possesses them. While steganopody,
precourtship “wing flipping,” or lack of an incubation patch may appear to be
well-characterized traits, they may in fact be descriptive only of homoplasious
similarities, or worse, of a conflation of similar but distinct traits, each with independent
histories (see Siegel-Causey and Kharitonov, 1997).

Aristotle objected to privative groups because they could not be logically
subdivided and because no group could be characterized by the absence of nothing.
Further problems obtain with privative groups, notably instability in higher order
relationships and assignment of group membership (Lynch and Renjifo, 1990). If
privative, at best the assemblage currently considered as constituting Pelecaniformes
presents many problems in interpreting the nature of morphological character states;
at worst, it is an artificial taxon confounding our understanding of morphological
evolution in birds.


B. The Enigma of Balaeniceps 


Another long-standing controversy that relates directly to the issue of pelecaniform 
monophyly and higher level relationships is the status of the shoebill, Balaeniceps rex. 
The controversy began one year after its description in 1850 by Gould (1852). He 
considered this “new and most remarkable form” to be a long-legged type of pele- 
canid, but Jardine (1851) considered the differences as outweighing the similarities 
to pelicans and instead declared it a member of the Ciconiiformes (herons, storks). 
Subsequent authorities agreed and differed only in whether it was a heron, a stork, 
or a monotypic family. Cottam (1957) reasserted Gould's original impression of 
pelecanid affinities by examining similarities in cranial osteology. Cottam identified 
several osteological features shared by Balaeniceps and pelecanids, and not shared 
with other putative sister-groups. For example, in shoebills and pelicans the external 
nares are posterior to the internal nares, the prevomer is weakly developed, the 
palatines are ankylosed, the hypocleideum is fused to the sternal carina (also found 
in frigatebirds), the stomach has a pyloric chamber, and the syrinx lacks intrinsic 
muscles. Cottam's findings were ignored by most except for Olson (1979), and 
Cracraft (1985) concluded that the pelecaniform characters adduced by Cottam 
were either convergent or primitive. Cottam’s work preceded the advent of modern 
phylogenetic systematics, so the analysis lacks currently accepted rudiments such as 
character assessment and polarization by comparison to outgroups. Subsequent
molecular evidence (e.g., Hedges and Sibley, 1994), however, supports Cottam's
conjecture and pairs the shoebill and pelicans as sister taxa (Fig. 6.1).

Several points can be made about pelecaniform relationships as they are known 
today. First, more recent results suggest that Pelecaniformes is not monophyletic, 
particularly in that the group relationship of tropicbirds and frigatebirds is contro- 
versial. Second, the phylogenetic relationship of the shoebill to Pelecaniformes is 
unknown. Third, knowledge of pelecaniform relationships is central to understanding
evolutionary patterns in the early evolutionary history of birds, with respect not
only to past and current treatment of morphological data, but also to what claims can
be made about early evolutionary branching patterns in birds.

This chapter describes preliminary results of the molecular systematics of the
traditional Pelecaniformes. Results are compared with previous molecular studies
and with reanalysis of existing data as a means to understand some of the disparity
in findings. Some potential problems in the characterization of morphological and
behavioral traits traditionally used in phylogenetic analyses are discussed, and some
possible new approaches in character analysis are suggested.


II. METHODS


A. Data Sets and Analysis


Morphological data used here are primarily those published in morphological, os- 
teological, and behavioral studies (Cottam, 1957; van Tets, 1965; Cracraft, 1985; 
Saiff, 1978; Siegel-Causey, 1986a,b, 1988, 1991, 1992). Numerous characters have 
been proposed, but many were judged to be homoplasious, ill-defined, or traits 
with indeterminable polarities (Siegel-Causey, 1996). Two analyses were
performed, one with all 20 characters, and another with two classic characters removed
(19, steganopody; 20, gular pouch; Table I).

Molecular data sets are from molecular studies by the author and Hedges and
Sibley (1994). Partial DNA sequences of three mitochondrial genes (12S rRNA,
16S rRNA, and cytochrome B mtDNA), totaling about 1.5 kb of aligned sequence,
were obtained from 2 to 5 individuals of the following 12 species of birds:
Magellanic penguin (Spheniscidae: Spheniscus magellanicus), Laysan albatross (Diomedeidae:
Diomedea immutabilis), short-tailed shearwater (Procellariidae: Puffinus tenuirostris),
Leach’s storm petrel (Hydrobatidae: Oceanodroma leucorhoa), white-billed tropicbird
(Phaethontidae: Phaethon lepturus), magnificent frigatebird (Fregatidae: Fregata
magnificens), white pelican (Pelecanidae: Pelecanus erythrorhynchus), gannet (Sulidae:
Morus bassana), double-crested cormorant (Phalacrocoracidae: Phalacrocorax auritus),
anhinga (Anhingidae: Anhinga anhinga), shoebill (Balaenicipitidae: Balaeniceps rex),
and black stork (Ciconiidae: Ciconia nigra). The sequenced regions correspond to
sites 1765-2040 (12S rRNA), 2800-3750 (16S rRNA), and 14,905-15,275
(cytochrome B) in the published sequence of chicken (Gallus gallus; Desjardins and
Morais, 1990).

DNA was recovered from frozen tissue obtained by the author, by collection or
loan, using standard protocols. Tissue was ground to a powder in liquid nitrogen;
approximately 100 mg of tissue was digested in a buffer containing proteinase K
and nonionic detergents or containing guanidinium isothiocyanate. Following
incubation, the solution was extracted twice with equal volumes of
phenol:chloroform:isoamyl alcohol (24:24:1). The DNA was precipitated with ethanol and
suspended in 50 μl of Tris-EDTA buffer.

Gene regions were amplified by polymerase chain reaction (PCR) using conserved
primer regions among vertebrates (Kocher et al., 1989; Hedges and Sibley,
1994). Single-stranded DNA was made by asymmetric amplification with one of
the two primers limiting, and purification of the template and sequencing were by
methods described elsewhere (Siegel-Causey, 1997). The resulting PCR product


TABLE I Data Set of Traditional Characters Used in Analyses

Character
Taxon 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Outgroup 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Diomedeidae 11 1000 0 0 0 0 0 0 0 0 0 
Ciconiidae 1 110000 0.0 00 0 0.0 0 0 0 
Phaethontidae 1110000 0 0 0 0 0 0 0 0 1 1 
Fregatidae 101110100 0 0 0 0 0 0 0 1 1 
Pelecanidae 101111100 0 0.0 0 00 O0 1 1 
Balaeniceps T 11111 1.0 0-0 0. 0..0..0..0 20 OQ: 0:0 
Sulidae I 0 T 1.1 1.0-1 T -1 1 d T 0 .0.-0..0- 1. 1 
Phalacrocoracidae 1 1 O 1 1 1 1 0 1 1 1. 1 1 1 1 13 1 1|» 1 1] 
Anhingidae JH Олар X 0L. 1 1 1 1 1 1 1 1 1 1 


Characters were based on traditional analyses of morphology and behavior.

Character descriptions are as follows: (1) vascular notches in cranial metotic process (0, absent; 1,
present) (Saiff, 1988); (2) cranial nerve IX foramen (0, indistinct; 1, separate) (Saiff, 1988); (3) cranial 
nerve V foramen (0, posterior or subequal; 1, anterior) (Saiff, 1988); (4) bony nostril (0, present; 1, 
absent, reduced, or indistinct) (Cracraft, 1985); (5) incubation (0, never by feet; 1, usually by feet) (Cra- 
craft, 1985); (6) stapedial canal (0, present; 1, absent) (Saiff, 1988); (7) eustachian canal (0, open; 1, 
partially or completely closed) (Cracraft, 1985); (8) hypocleideum (0, always free; 1, usually fused to 
sternal carina) (Cottam, 1957); (9) maxillopalatines (0, large; 1, reduced) (Cracraft, 1985); (10) braincase 
width to depth (0, subequal; 1, wider than deep) (Cracraft, 1985); (11) orbital process of quadrate (0, 
large; 1, reduced) (Cracraft, 1985); (12) lateral wall of presphenoid sinus (0, present; 1, absent, reduced, 
or indistinct) (Cracraft, 1985); (13) hop display (0, present; 1, absent) (van Tets, 1965); (14) sky-pointing 
display (0, absent; 1, present) (van Tets, 1965); (15) postorbital processes (0, present; 1, absent) (Cracraft, 
1985); (16) lateral wall of presphenoid sinus (0, without bony ring; 1, with bony ring) (Cracraft, 1985); 
(17) interorbital septum (0, present; 1, absent) (Cracraft, 1985); (18) occipital style (0, absent; 1, present) 
(Cracraft, 1985); (19) totipalmate foot (0, absent; 1, present) (Aristotle); (20) gular pouch (0, absent; 1, 
present) (Aristotle). 


was visualized with ethidium bromide staining to verify product band size and
purified from primers (Wizard PCR preps; Promega, Madison, WI). Cycle sequencing
was carried out using nonradioactive labels (Genius System; Boehringer
Mannheim, Indianapolis, IN). The cycle sequencing product was run out on a 6%
denaturing polyacrylamide gel. Sequences discussed in this chapter have been deposited
in the GenBank database (accession Nos. L33368-L33397, U83149-U83160,
U83203, U83204).

Alignments of putative homologous sequences were done using GCG (Genetics
Computer Group, Madison, WI) with multiple alignment parameters of fixed and
floating gap penalty equal to 10 and pairwise parameters of gap penalty equal to 3
and k-tuple equal to 1. Insertion and deletion gaps were coded as missing (Swofford,
1993). Phylogenetic signal within the data was assessed using the g1 statistic of the
random tree distribution (Hillis and Huelsenbeck, 1992). Character states were
polarized by a conservative reference to outgroup taxa; i.e., characters with ambiguous
or polymorphic states among outgroup taxa were discarded (see Wiley et al., 1991).
Parsimony analyses were performed using the heuristic search option in PAUP with
random addition of taxa (Swofford, 1993). The neighbor-joining method (Saitou
and Nei, 1987), using the Jukes and Cantor (1969) and Kimura (1980) models of
nucleotide substitution, and the maximum-likelihood method were implemented using
PHYLIP (Felsenstein, 1991).
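
The two substitution models named above correct an observed proportion of differences for multiple hits at the same site. The following is an illustrative sketch of the standard distance formulas, not the PHYLIP implementation: the two short sequences are invented, the function names are our own, and gapped positions are skipped in keeping with the treatment of gaps as missing data.

```python
import math

PURINES = {"A", "G"}


def classify_differences(seq1, seq2):
    """Count compared sites, transitions, and transversions between two sequences."""
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gaps and ambiguity codes are treated as missing
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    return compared, transitions, transversions


def jukes_cantor(p):
    """Jukes-Cantor distance from the proportion p of differing sites."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(P, Q):
    """Kimura two-parameter distance from transition (P) and transversion (Q) proportions."""
    return -0.5 * math.log(1.0 - 2.0 * P - Q) - 0.25 * math.log(1.0 - 2.0 * Q)


# Hypothetical aligned fragments; "-" marks an alignment gap.
seq_a = "ACGTACGTACGTACGTAC-T"
seq_b = "ACATACGTTCGTACGCACGT"
n, ts, tv = classify_differences(seq_a, seq_b)
d_jc = jukes_cantor((ts + tv) / n)
d_k2p = kimura_2p(ts / n, tv / n)
print(n, ts, tv, round(d_jc, 4), round(d_k2p, 4))
```

Both corrected distances exceed the raw proportion of differences. The Kimura model additionally allows transitions and transversions to accumulate at different rates.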


III. RESULTS


A. Molecular Evidence


Using the sequences and alignment procedures described above, a random tree
distribution generated using PAUP had g1 = -0.615, which indicates
that these data are significantly skewed at P < 0.01 (Hillis and Huelsenbeck, 1992).
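
The skewness statistic can be illustrated on a small set of tree lengths. This sketch assumes the usual moment-based definition of g1 (third central moment divided by the 1.5 power of the second); the tree lengths are hypothetical, and significance would still be judged against the published critical values rather than by this calculation alone.

```python
def g1_skewness(lengths):
    """Moment-based skewness: m3 / m2**1.5, using population (divide-by-n) moments."""
    n = len(lengths)
    mean = sum(lengths) / n
    m2 = sum((x - mean) ** 2 for x in lengths) / n
    m3 = sum((x - mean) ** 3 for x in lengths) / n
    return m3 / m2 ** 1.5


# Hypothetical lengths of randomly sampled trees: most are long, a few are short,
# so the distribution has a tail toward shorter (better) trees.
random_tree_lengths = [1149, 1190, 1205, 1210, 1214, 1216, 1218, 1219, 1220, 1221]
g1 = g1_skewness(random_tree_lengths)
print(round(g1, 3))
```

A negative value indicates a left-skewed distribution, the pattern taken to signal phylogenetic structure in the data.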


A heuristic search with random sequence addition and tree-bisection-reconnection
branch swapping resulted in a single most-parsimonious tree (Fig. 6.2), with a
length of 1149 steps, a CI excluding uninformative characters of 0.519, and a
rescaled consistency index (RC) of 0.192. The confidence in the parsimony tree was
assessed by 2000 bootstrap replications; all but two bootstrap percentile (BP) values
were greater than 50. The nodes joining the putative clade comprising
Hydrobatidae and Cathartidae, and this with the Phaethontidae, were found in only 41 and
45% of the replications, respectively. The “core” pelecaniform taxa (Anhingidae,
Phalacrocoracidae, Sulidae) were robustly supported by bootstrap replications with
BP values of 98 and 100. Other groupings were less well supported by the data
(e.g., the clade comprising Phaethontidae + Diomedeidae + Fregatidae + “core”
pelecaniforms was found in 51% of replications), but most bootstrap percentile
values exceeded 80. One of the most robust groupings (BP = 95) was the clade
Pelecanidae + Balaenicipitidae, which in this analysis was revealed as the
sister-group to the remaining traditional pelecaniform taxa.


[Tree diagram; terminal taxa, top to bottom: Anhingidae, Phalacrocoracidae, Sulidae, Fregatidae,
Diomedeidae, Procellariidae, Cathartidae, Phaethontidae, Balaenicipitidae, Pelecanidae, Ciconiidae,
Spheniscidae.]

FIGURE 6.2 The single most-parsimonious tree (length = 1149) based on mitochondrial 12S-16S
rRNA and cytochrome B nucleotide sequences for pelecaniforms and relatives resulting from maximum
parsimony analysis using a heuristic search by PAUP (CI = 0.519, RC = 0.192). Majority-rule bootstrap
percentile values generated by 2000 replications are shown on branches.
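
The logic of a bootstrap percentile can be sketched by resampling alignment columns with replacement and recording how often a grouping is recovered. This is a toy illustration only: the sequences are invented, and "recovery" is judged by a stand-in criterion (which pair of taxa shows the fewest differences, ties broken alphabetically) rather than by a parsimony search in PAUP. All names are our own.

```python
import random


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in columns) for taxon, seq in alignment.items()}


def differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def closest_pair(alignment):
    """Pair of taxa with the fewest differences (stand-in for a recovered clade)."""
    taxa = sorted(alignment)
    scored = [
        (differences(alignment[s], alignment[t]), s, t)
        for i, s in enumerate(taxa)
        for t in taxa[i + 1:]
    ]
    _, s, t = min(scored)
    return frozenset((s, t))


def bootstrap_percentile(alignment, pair, replicates=2000, seed=1):
    """Percentage of bootstrap replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    hits = sum(
        closest_pair(resample_columns(alignment, rng)) == pair
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


# Hypothetical 16-site alignment for four taxa.
toy = {
    "pelican": "ACGTACGTACGTACGT",
    "shoebill": "ACGTACGTACGAACGT",
    "stork": "ACTTACGAACGATCGT",
    "penguin": "TCTTGCGAACCATCGA",
}
support = bootstrap_percentile(toy, frozenset(("pelican", "shoebill")))
print(support)
```

The percentile measures how consistently the sampled characters support a grouping. It is not a probability that the grouping is correct.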


B. Morphological and Behavioral Evidence 


Analyses using all characters and with two characters removed produced similar 
results; the larger data set yielded a proportionally longer and less robust tree 
(length = 23; CI = 0.899; RC = 0.891; g1 = -1.433, P < 0.01). The two characters
in question, characters 19 (steganopody) and 20 (gular pouch), are traditionally
invoked for pelecaniform monophyly. To examine more closely the nature of
the other characters used in the analysis, the following discussion is limited to the 
data set with characters 19 and 20 removed. 
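
The g1 value reported above is taken here to be the skewness of a distribution of tree lengths; this is an assumption, since the text does not define the statistic. A strongly negative value indicates a left-skewed distribution, with few trees close to the shortest length. The sketch below uses plain population moments, and the estimator in PAUP may differ in detail. `g1_skewness` is a hypothetical helper name.

```python
def g1_skewness(tree_lengths):
    """Skewness g1 = m3 / m2**1.5, using population (divide-by-n) moments."""
    n = len(tree_lengths)
    if n < 2:
        raise ValueError("need at least two tree lengths")
    mean = sum(tree_lengths) / n
    m2 = sum((x - mean) ** 2 for x in tree_lengths) / n
    m3 = sum((x - mean) ** 3 for x in tree_lengths) / n
    if m2 == 0:
        raise ValueError("all tree lengths are identical; skewness is undefined")
    return m3 / m2 ** 1.5
```

For the toy sample [10, 20, 20, 20, 20], with one short tree and four longer ones, the function returns -1.5.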

The reduced morphological data set was analyzed with the same reconstruction
methods as the molecular data; a single tree was found (length = 19 steps; CI =
0.944; RC = 0.914; g1 = -1.811, P < 0.001). The “core” pelecaniform taxa were
similarly well supported by the traditional data, with 100% of the bootstrap
replications recovering the clade comprising Sulidae + Phalacrocoracidae +
Anhingidae (Fig. 6.3). The sister-group to this clade (Fregatidae + Balaenicipitidae +
Pelecanidae) was less supported in terms of bootstrap replications (BP = 68) in a
polytomy; however, members uniquely shared a single morphological synapomorphy
(character 8: hypocleideum fused with sternal carina). The Phaethontidae,
the remaining family of traditional pelecaniform taxa, was not found to be a member
of the preceding clade. All branchings except for that including Balaeniceps were
supported by three to seven synapomorphies.
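
The fit statistics quoted in this section follow standard definitions: the consistency index is the minimum possible number of steps divided by the observed number, the retention index compares the observed length with the maximum possible length, and the rescaled consistency index is their product. The Python sketch below states those definitions; the function names and the toy step counts are illustrative assumptions, not values from the analysis.

```python
def fit_indices(min_steps, observed_steps, max_steps):
    """Return (CI, RI, RC) for a data set on a given tree.

    CI = m / s, RI = (g - s) / (g - m), RC = CI * RI, where m, s, and g are
    the minimum, observed, and maximum possible numbers of steps.
    """
    ci = min_steps / observed_steps
    ri = (max_steps - observed_steps) / (max_steps - min_steps)
    return ci, ri, ci * ri


def implied_ri(ci, rc):
    """Retention index implied by RC = CI * RI.

    This holds only if the quoted CI is the same one that enters the product,
    which the text does not state.
    """
    return rc / ci
```

Under that caveat, the values reported for the reduced morphological data set (CI = 0.944, RC = 0.914) would imply a retention index of about 0.97.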


IV. DISCUSSION 


The preliminary analyses of the molecular data and the traditional data reported
here agree in several important aspects. First, there is joint support for a “core” clade
of pelecaniform taxa, i.e., Anhingidae + Phalacrocoracidae + Sulidae. This finding
is reassuring in that it follows that of nearly every previous analysis done for
Pelecaniformes. While consensus does not necessarily equate to reliability, it strongly
suggests that morphology, behavior, and DNA sequence data are indicative of a
common evolutionary pattern. Second, a close relationship between shoebills and
pelicans is supported by both molecular and morphological data. It thus appears that
Cottam’s (1957) conjecture may be correct in that Balaeniceps rex is not the shoebill
stork, but the shoebill pelican!

FIGURE 6.3 The single most-parsimonious tree (length = 19) based on morphological and
behavioral characters for pelecaniforms and relatives resulting from maximum parsimony reconstruction
using a heuristic search by PAUP (CI = 0.944, RC = 0.914). Majority-rule bootstrap percentile values
generated by 2000 replications are shown on branches.

Except for the "core" pelecaniform taxa, and the close relationship between 
shoebills and pelicans, nearly every modern analysis of pelecaniform relationships 
has produced different results. Nonetheless, what is quite apparent is that none of 
the results discussed here supports monophyly of the Pelecaniformes. If the tradi- 
tional Pelecaniformes are thus paraphyletic, this brings into question how tradi- 
tional pelecaniform characters are characterized. 

Closer examination of the data supporting the basal branches obtained in the 
tree based on traditional data (Fig. 6.3) illustrates some of the problems associated 
with the conflict among previous studies of Pelecaniformes. The four characters 
supporting the branch including pelecaniform taxa except for tropicbirds appear 
problematic. For example, character 5 [incubating by feet rather than by incubation
patch (character 41 of Cracraft 1985)] is by definition acceptable, but the behavior
clearly is associated with the loss of an incubation patch. Furthermore, incubation
behavior in Pelecaniformes is not uniform and there is variation among species and
even individuals in whether the eggs are covered by the feet during incubation or
whether they are held between the feet and body (Siegel-Causey, 1987, 1988; van
Tets, 1965). Similarly, character 6 [lack of a stapedial canal in the opisthotic cranium
(Saiff, 1978)] represents the loss or absence of a character that is otherwise present
in outgroups.

Six of the eight characters supporting basal branches in the morphological tree
shown in Fig. 6.3 (i.e., characters 3-8) can be questioned on the grounds that the
characters represent loss of features, that they possess a derived state synapomorphy
that is homoplasious in other related taxa, or that they comprise a potentially
confounded set of independent traits. The two characters most commonly invoked as
indicative of pelecaniform monophyly are those that have been least studied of all
(i.e., characters 19 and 20). Surprisingly, a concerted search of the considerable
literature on pelecaniform anatomy did not reveal a single study on these two
features, except for repeated mentions that they exist. The aberrant forms of the gular
pouch and the nonconforming appearance of the webbing in Fregatidae and
Phaethontidae argue for a closer examination of this morphology and for consideration
of the possibility that these two morphological traits have a more complex evolutionary
pattern than thought previously. As suggested at the beginning, the traditional avian
order of Pelecaniformes may be privative, and therefore only an arrangement of
taxa associated by artificial characteristics.

Analogous findings were reported by Hedges and Sibley (1994) using a different
set of taxa and molecular data (i.e., mitochondrial 12S and 16S genes). Tree topolo- 
gies differed substantially from those reported here, although they also found strong 
support for the "core" pelecaniform taxa, and association between the shoebill and 
pelicans. In addition, they found no support for monophyly of the Pelecaniformes, 
and none for association of Fregatidae with other traditional pelecaniform taxa. 

Data set incongruity as commonly discussed with respect to morphological vs
molecular data (e.g., Patterson et al., 1993; Miyamoto and Fitch, 1995) may lie
more in the unsuitability of particular characters used in traditional analyses, and in
the conceptual and methodological gulf separating distance and cladistic analyses.
Although molecular studies of Pelecaniformes are still in an elementary stage, the
results to date are heuristic and indicate the need for reexamination of the
morphological characters traditionally utilized in systematics research, and further
investigation into the status of Pelecaniformes as a natural taxon.


I. INTRODUCTION


Perhaps no group of birds has engendered more controversy over their relationships 
and evolutionary history than have the paleognaths, including both the flightless 
ratites and the tinamous. The ratite birds, in particular, have interested and
perplexed avian systematists for well over 100 years. Early workers, noting the absence
of a keel on the sternum, placed the ratites together (Merrem, 1813; Lesson, 1831), 
and this view was later reinforced by Huxley (1867), who noted that they share a 
so-called “dromaeognathous palate.” 

Other avian systematists, on the other hand, denied a close relationship, believing
instead that the similarities among ratites were due to convergence (Fürbringer,
1888, 1902; Parker, 1895). Certainly a primary reason for discounting a close rela- 
tionship among the ratites was that their widely disjunct distributions—interpreted 
within the context of a stable-continent paradigm—could seemingly be explained 
only by assuming independent origins from separate, volant ancestors. 

Most subsequent investigators during the first half of this century did not accept 
paleognath monophyly, thinking instead that all the major groups of paleognaths 
arose independently from various carinate ancestors (see Sibley and Ahlquist, 1990, 
for a review). Using behavioral similarities, Meise (1963) was among the first to 
present an explicit hypothesis of paleognath monophyly as well as for the rela- 
tionships among the families. A decade later, Cracraft (1974) presented a cladistic 
hypothesis of paleognath monophyly and ratite interrelationships using an osteo- 
logical data set. 

Although most subsequent authors have supported or accepted the strict mono- 
phyly of the paleognaths (Ho et al., 1976; Prager et al., 1976; Rich, 1979; de Boer, 
1980; Sibley and Ahlquist, 1981, 1990; Stapel et al., 1984; Cracraft, 1981, 1986,
1988; Bledsoe, 1988; Cracraft and Mindell, 1989; Cooper et al., 1992), a minor- 
ity of opinion has questioned that hypothesis (Feduccia, 1980, 1985; Houde and 
Olson, 1981; Olson, 1985; Houde, 1986, 1988). 

Within the framework of paleognath monophyly, the debate has shifted to con- 
troversies over the interrelationships among the ratites. At the heart of this debate 
are apparent conflicts over, first, what morphological characters appear to be telling 
us about those interrelationships, and second, the seemingly disparate relationships 
implied by several different molecular data sets, on the one hand, and morphology, 
on the other. 

Figure 7.1 presents six of the most recent hypotheses of paleognath interrelationships.
A cladistic analysis of osteological data (Cracraft, 1974; Fig. 7.1A) confirmed
earlier suppositions that the tinamous are the sister-group to the ratites and that 
emus and cassowaries are closely allied. The morphological data also linked rheas 
and ostriches on the basis of a large suite of shared derived characters and 
were consistent with the hypothesis that the kiwis of New Zealand were at the base 
of the ratite tree and apparently related to New Zealand moas (the latter group not 
shown on Fig. 7.1A). This morphological hypothesis was challenged by Sibley 
and Ahlquist (1981) on the basis of DNA hybridization data and by the use of a 
"personal communication" from H. H. Bledsoe as well as from J. J. Baker and 
C. McGowan that a reinterpretation of the morphological data is not congruent 
with the Cracraft (1974) hypothesis but is consistent with the DNA hybridization

FIGURE 7.1 Six hypotheses for the relationships among ratite birds. See text. (A) Morphology:
Cracraft (1974). (B) DNA hybridization: Sibley and Ahlquist (1981). (C) Morphology: Bledsoe (1988).
(D) DNA hybridization: Sibley and Ahlquist (1990: Fig. 326). (E) DNA hybridization: Sibley and
Ahlquist (1990: Fig. 354). (F) 12S rRNA sequences: Cooper et al. (1992).


tree (Fig. 7.1B). The reinterpretation of Bledsoe was published 7 years later (Bled- 
soe, 1988; Fig. 7.1C), and Bledsoe’s tree is identical to that of Cracraft (1974) with 
the exception of the placement of the kiwis close to emus and cassowaries rather 
than at the base of the ratites. Baker and McGowan’s analysis has not yet been
published; nevertheless, Sibley and Ahlquist (1990, p. 283) cited once again a personal
communication to the effect that those authors were unable to duplicate Cracraft's 
results. 

The DNA hybridization data themselves have remained consistently ambiguous. 
The hypothesis of Sibley and Ahlquist (1981; Fig. 7.1B) was not duplicated by a 
subsequent study (Sibley and Ahlquist, 1990), and in the latter at least two different 
hypotheses were inferred from the DNA hybridization data depending on different 
assumptions of the clustering algorithms (Fig. 7.1D and E). One of these hypotheses
placed rheas and ostriches together, just as inferred from morphology. The primary 
difference, however, between these results and those of morphology was the place- 
ment of the kiwis as the sister-group to the emus and cassowaries on the molecular 
tree. A similar result was also found by Cooper et al. (1992; Fig. 7.1F), using se- 
quences from a small fragment of mitochondrial 12S rRNA. 

The conflicts among these hypotheses exemplify the tone of the debates that 
often arise when morphological and molecular data give different results. Within 
ornithology, as elsewhere within systematics, it has often been the case that the 
morphological data are dismissed out of hand with the claim of "convergence" (e.g., 
Sibley and Ahlquist, 1990, p. 283), even when the molecular data themselves have 
either been poorly or incorrectly analyzed or, when correctly analyzed, are am- 
biguous. Nor does it help if the molecular data are dismissed by those convinced 
that morphology is a more persuasive indicator of relationships. In either case,
questions still remain: Are rheas and ostriches related, or is one or the other linked to
the other ratites by a short internode? Are kiwis close to emus and cassowaries or 
are they basal to other ratites? If the former, then how does one explain the perva- 
sive reversal of many morphological characters to a more primitive condition? If 
kiwis are basal, what factors within the molecular data are leading them to cluster 
kiwis with emus and cassowaries? Answers to these questions have ramifications for 
data analysis well beyond the paleognaths. 

In an effort to investigate paleognath interrelationships further we have aug- 
mented both morphological and molecular data sets. The morphological characters 
described by Cracraft (1974) and Bledsoe (1988) were reevaluated to verify codings 
and diagnoses and to attempt a resolution of the discrepancies between their analy- 
ses. Several new postcranial characters are described and, in addition, new charac- 
ters based on an examination of cranial morphology are added to the osteological 
database. To examine the results provided by molecules, a new data set has been 
assembled on the basis of nucleotide sequences from the complete 16S rRNA and 
cytochrome b genes, complete tRNA’, as well as large portions of the 12S rRNA,
cytochrome oxidase I (COI), and cytochrome oxidase II (COII) genes of the
mitochondrial genome. In addition, a 361-base pair (bp) fragment of the 12S rRNA
gene published by Cooper et al. (1992) is also included in the analysis. In all, 5444 
bp of sequence are used in this study, more than in any previous comparative analy- 
sis of avian sequences. 


II. MATERIALS AND METHODS 


A. Taxa 


1. Osteological Analysis 


Skeletons were examined from the following institutions: the Field Museum of 
Natural History (FMNH; Department of Geology and the Division of Ornithology 
within the Department of Zoology, Chicago, IL); the American Museum of Natu- 
ral History (AMNH; Department of Ornithology and the Department of Verte- 
brate Paleontology, New York, NY). These include Tinamidae: Tinamus tao kleei 
(FMNH 315145), Tinamus major (FMNH 104192, AMNH 3675), Crypturellus 
cinnamomeus (FMNH 104259), Crypturellus undulatus (FMNH 290488, AMNH 
6481, 6479), Crypturellus noctivagus (AMNH 10444, 10443), Nothoprocta cineres- 
cens (AMNH 6505), Eudromia elegans (AMNH 8678), Rhynchotus rufescens (AMNH 
6605); Dinornithidae: Megalapteryx didinus (FMNH PA177), Emeus crassus (FMNH 
PA3A), Dinornis maximus (FMNH PA35, AMNH VP7303), Dinornis robustus
(AMNH VP80, VP81), Pachyornis elephantopus (AMNH VP7307, VP7313), Emeus 
sp. (AMNH VP69), Euryapteryx sp. (AMNH VP7309); Struthionidae: Struthio 
camelus (FMNH 106776, 313619, 104586, AMNH 1503, 1294, 4474, 3199); Rhei- 
dae: Rhea americana (FMNH 105749, 104061, 105636, AMNH 2875, 3783, 6470), 
Rhea sp. (AMNH 2300); Casuariidae: Casuarius casuarius (FMNH 314889, 93274,
AMNH 1106), Casuarius unappendiculatus (FMNH 104271), Casuarius sp. (AMNH 
1517, 1554); Dromaiidae: Dromaius novaehollandiae (FMNH 104536, 313620, 
AMNH 1513, 3742, 18458); Apterygidae: Apteryx australis (FMNH 85775 [a juvenile],
314890, AMNH 3738, 4437, 5372, 18456), Apteryx sp. (AMNH 3739);
Galliformes/Anseriformes: Megapodiidae: Megapodius freycinet (FMNH 104631, 
AMNH 1389), Leipoa ocellata (FMNH 105235, 1359); Cracidae: Crax mitu
(AMNH 3815), Nothocrax urumutum (AMNH 6043); Anhimidae: Anhima cornuta 
(FMNH 105629, 105812, 104293). 


2. Molecular Analysis 


Tissue samples (muscle, liver) for DNA extraction were obtained from the frozen 
tissue collection at the Museum of Natural Science at Louisiana State University 
(LSUMZ; Baton Rouge, LA), Bernice P. Bishop Museum (BPBM; Honolulu, 
HI), and the American Museum of Natural History. These include Tinamus major 
(LSUMZ B15087), Nothoprocta perdicaria (AMNH 10558), Struthio camelus (LSUMZ 
B8610), Rhea americana (LSUMZ B8608), Dromaius novaehollandiae (LSUMZ B8607),
Casuarius bennetti (BPBM 109892), and Apteryx australis mantelli (LSUMZ B8606). 
Because Cooper et al. (1992) did not include Tinamus or Nothoprocta in their 12S 
rRNA study, it was necessary to use their tinamou species, Eudromia elegans, to com- 
plete the tandem alignment of the sequences. 


B. DNA Extraction and Sequencing 


DNAs were extracted by boiling minute quantities of muscle or liver in 250 µl of
5% (w/v) Chelex (Bio-Rad, Hercules, CA) suspension for 15 min, with occasional
vortexing. The Chelex resin was pelleted by spinning it for 30 sec in a benchtop
centrifuge, and the supernatant used as a template for polymerase chain reaction
(PCR) amplifications. In the majority of cases, 10-µl reactions containing 1 µl of
Chelex-extracted DNA, 1 µl of 2 mM dNTPs, 2 µl of Turbo buffer [250 mM Tris
(pH 8.5), 10 mM MgCl2, 100 mM KCl, bovine serum albumin (BSA, 2.5 mg/ml)],
0.75 U of Taq polymerase (Promega, Madison, WI), and 1 µl of each 10 µM
primer, were sealed in glass capillary tubes for cycling in an Idaho Technologies air
thermocycler. Reaction conditions were as follows: 94°C, 5 sec; 45-50°C, 2 sec;
72°C, 15 sec; for 35-40 cycles. Aliquots (5 µl) were separated on 2% low-melting-
point agarose gels and visualized with ethidium bromide and ultraviolet (UV) light.
Plugs were taken from appropriate-sized bands with sterile Pasteur pipettes and
melted at 72°C for 15 min in 250 µl of H2O. The gel-purified DNA was used as a
template to prepare clean PCR products for direct sequencing. Reactions (40 µl)
were assembled, containing 1.5 µl of DNA, 4 µl of 2 mM dNTPs, 8 µl of Turbo
buffer, 0.75 U of Taq polymerase, and 2 µl of each primer. PCR conditions were as
follows: 94°C, 10 sec; 48-52°C, 5 sec; 72°C, 25 sec; for 40 cycles. Aliquots (5 µl)
were examined on agarose gels, and the remainder further purified with the BIO
101, Inc. (San Diego, CA) GeneClean II system, ultimately resuspending the total
DNA from each PCR reaction in 6 µl of H2O.
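
The final concentrations in the amplification reactions follow from the stock concentrations and volumes given above through C1 x V1 = C2 x V2. The sketch below works this out in Python. The helper names are hypothetical, and it assumes that each listed component is simply diluted into the stated total reaction volume (the text does not list the volume of water added).

```python
def final_concentration(stock, volume_added_ul, total_volume_ul):
    """Solve C1 * V1 = C2 * V2 for C2; the result has the units of `stock`."""
    return stock * volume_added_ul / total_volume_ul


# Stock composition of the Turbo buffer as given in the text.
TURBO_BUFFER_STOCK = {
    "Tris (mM)": 250.0,
    "MgCl2 (mM)": 10.0,
    "KCl (mM)": 100.0,
    "BSA (mg/ml)": 2.5,
}


def buffer_in_reaction(buffer_ul, total_ul):
    """Final concentration of each buffer component in the reaction."""
    return {
        name: final_concentration(conc, buffer_ul, total_ul)
        for name, conc in TURBO_BUFFER_STOCK.items()
    }
```

Under these assumptions the 10-µl reaction contains 0.2 mM dNTPs, 1 µM of each primer, 50 mM Tris, 2 mM MgCl2, 20 mM KCl, and 0.5 mg/ml BSA. The 40-µl reaction, with 4 µl of dNTPs and 8 µl of buffer, reaches the same dNTP and buffer concentrations.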

DNA was sequenced using dye-terminator chemistry in an Applied Biosystems,
Inc. (ABI; Foster City, CA) 377 automated sequencer. Reactions contained 0.6 µl of
DNA from the procedure described above, 1.4 µl of H2O, 1.1 µl of 10 µM primer,
and 2.9 µl of ABI Dye Terminator Cycle Sequencing Ready Reaction with
AmpliTaq DNA polymerase, FS. Each DNA sample was sequenced in both directions.
An initial denaturation step of 1 min at 95°C was followed by 32 cycles of
95°C, 10 sec; 50°C, 5 sec; 60°C, 3 min; and a 4°C soak cycle. Completed reactions
were cleaned by passage through Sephadex G-50 spin columns for 3 min at
3000 rpm in a benchtop centrifuge. Samples were dried in a Speed-Vac (Savant,
Hicksville, NY) for 30 min, and resuspended in loading buffer consisting of Blue


TABLE I. Primers Used to Amplify and Sequence Mitochondrial Genes Used in
This Study^a,b

Primer    Sequence
16S rRNA
L2311     5' CAAAGCATTCAGCTTACACC 3'
L2313     5' CAAAGCATTCAGCTTACACCTG 3'*
H2688     5' CTCGGTAGGCTTTTCACCTCTAC 32%
L2703     5' AGCAGAGGTGAAAAGCC 3'
H2901     5' TCTTTTGTTGGTGGCTGCTT 3'
L2909     5' TGTAGACCTTCAAGCAGCCA 3'
H3287     5' TTGATTGCGCTACCTTTGCACGGTTAGG 3'
H3620     5' GGTCCATTGCTCAATTATATTGGG 3'
L3450     5' GAAGACCCTGTGGAACTTTAA 3'
H4015     5' GGAGAGGATTTGAACCTCTG 3'
L3183     5' AAGGAACTCGGCA 3'
H3171     5' TGCCGAGTTCCTT 3'
H3426     5' AGGGTCTTCTCGTC 3'
Cytochrome b
L14827    5' CCACACTCCACACAGGCCTAATTAA 3'
H15298    5' AAACTGCAGCCCCTCAGAATGATATTTGTCCTCA gree
L15068    5' АСТАССААТАСАСТАСАСАССАСА J
H15505    5' CTGCATGAATTCCTATTGGGTTGTTTGATCC 3'
L15311    5' CTACCATGAGGACAAATATC 32%
H15712    5' GCGTATGCGAATAGGAAATA 32
L15656    5' ААССТАСТАССАСАСССАСА м
H16065    5' GGAGTCTTCAGTCTCTGGTTTACAAGAC Ste


(Continues) 
Dextran-250 mM EDTA:formamide (1:6, v/v; ABI). Samples were run on 5%
Long Ranger (FMC, Philadelphia, PA) gels in Tris-borate-EDTA (TBE) buffer at
3 V, 50°C for 3-4 hr. Sequences were edited and assembled with Sequencher 3.0
(Gene Codes, Ann Arbor, Michigan) software.

In the early phases of the study portions of the 16S rRNA gene sequences were 
obtained using nonautomated methods. The protocol of Allard et al. (1991) was 
followed for amplification of single-stranded DNA, which was then directly se- 
quenced in both directions by the Sanger et al. (1977) dideoxy chain-termination 
method. Portions of the gene were sequenced using the Promega f-mol system, 
which utilizes a modified Taq polymerase to sequence double-stranded DNA in a 
thermal cycler under a high annealing temperature. 

Primers used for PCR amplifications and cycle sequencing are listed in Table I. 
All sequences obtained in this study have been deposited in GenBank under the 


TABLE I. (Continued)

Primer    Sequence
COI:
L6611     5' TCGAACCTCTGTAAAAAGGACTAC aur
L6615     5' CCTCTGTAAAAAGGACTACAGCC 3'**
L6955     5' СССАТСААСААСАТААССТТСТС 3%
H7002     5' CATCCTGTGCCGGCTCCAGCTTC 3%
H7032     5' TTGCCAGCTAGTGGGGGGTA Aut
L7318     5' ACATTCTTCGACCCAGCCGGAGG 3h
H7350     5' ACTTCGGGGTGTTCCGAAGAATCA ate
H7662     5' AGGAAGATGAATCCTAGGGCTCA 22%
COII:
L8419     5' TTCCACGACCACGCCCTAATAGT 2:2
H8844     5' TGGTTTAGTCGTCCAGGGATTGCGTC 3798
L8740     5' GGCCACTTCCGACTACTAGAAGT 3%
H9085     5' CAGGGGTTTGGGTTGAGTTGTGGCAT 3%
L8309     5' CTGTCAAGACTAAATCACAGG 3'
H8907     5' CCGCAGATTTCTGAGCATTGACC 3'
12S rRNA^c
L1264     5' CAAACAAAGCATGGCACTGAAG 3'*
H1861     5' TCGATTATAGAACAGGCTCCTC 3'*

^a Numbers refer to base position on the Gallus gallus mitochondrial genome (Desjardin and Morais,
1990).

^b All primers developed in our laboratory except for the following: *, primer courtesy of G. F.
Barrowclough and J. Groth; **, primer courtesy of workers in A. C. Wilson laboratory (Berkeley, CA);
***, primer courtesy of D. Mindell.

^c See also Cooper et al., 1992, for 12S rRNA primers used in their study.

following accession numbers (16S rRNA, 12S rRNA, cytochrome b, COI, COII,
and tRNA'»): U76036-U76077.
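
Two of the internal 16S rRNA primers in Table I, L3183 and H3171, appear to be exact reverse complements of one another, so they anneal to the same site on opposite strands. A short Python check is sketched below. Reading the L and H prefixes as the light and heavy strands is a common naming convention and is assumed here; the table footnote states only that the numbers refer to positions in the Gallus gallus sequence.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


L3183 = "AAGGAACTCGGCA"  # sequence as listed in Table I
H3171 = "TGCCGAGTTCCTT"  # sequence as listed in Table I

# True when the two primers cover the same site on opposite strands.
same_site = reverse_complement(L3183) == H3171
```

The same function can be used to compare any L/H pair in the table for overlap before planning sequencing reads in both directions.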


C. Phylogenetic Analysis 


Sequences of the 12S rRNA and all protein-coding genes were aligned by eye.
The 16S rRNA gene sequences had hypervariable regions that could not be easily
aligned by that method; therefore they were aligned using the MALIGN algorithm
(Wheeler and Gladstein, 1992). Following this procedure, however, regions of both
the 16S rRNA and 12S rRNA genes were considered to have such ambiguous
alignments that homology statements about base positions were virtually arbitrary; these
regions were eliminated from the phylogenetic analyses. These eliminated regions
correspond to the aligned sequence between the following pairs of base positions
(determined against the published Gallus sequence; Desjardin and Morais, 1990) for
16S rRNA and 12S rRNA, respectively: positions 2364-2385, 2396-2408, 2472-2477,
2514-2517, 2641-2644, 2758-2764, 2772-2774, 2779-2809, 2869-2890,
2934-2943, 2969-2971, 3260-3267, 3466-3478, 3502-3512, 3926-3951,
3962-3966; and positions 1383-1390, 1421-1426, 1520-1524, 1852-1860. In total,
225 base positions (characters) were eliminated, which also included gaps in the
preceding sequence positions that were required by the alignment with the
paleognath taxa.
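
The excluded regions listed above can be written as an explicit exclusion set, which also makes the arithmetic easy to check. The Python sketch below assumes that each pair of positions is inclusive at both ends; the variable and function names are illustrative only.

```python
# Excluded regions, in Gallus gallus coordinates, as listed in the text.
EXCLUDED_16S = [
    (2364, 2385), (2396, 2408), (2472, 2477), (2514, 2517),
    (2641, 2644), (2758, 2764), (2772, 2774), (2779, 2809),
    (2869, 2890), (2934, 2943), (2969, 2971), (3260, 3267),
    (3466, 3478), (3502, 3512), (3926, 3951), (3962, 3966),
]
EXCLUDED_12S = [(1383, 1390), (1421, 1426), (1520, 1524), (1852, 1860)]


def n_positions(ranges):
    """Number of reference positions covered by inclusive (start, end) pairs."""
    return sum(end - start + 1 for start, end in ranges)


def is_excluded(position, ranges):
    """True if a reference position falls inside any excluded region."""
    return any(start <= position <= end for start, end in ranges)


n_16s = n_positions(EXCLUDED_16S)   # 188
n_12s = n_positions(EXCLUDED_12S)   # 28
n_reference = n_16s + n_12s         # 216
n_gap_columns = 225 - n_reference   # 9
```

On this reading the listed ranges cover 216 Gallus positions, and the remaining 9 of the 225 eliminated characters would be the gap columns that the text says were introduced by alignment with the paleognath taxa.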

Morphological character-states were polarized into primitive or derived states 
by the method of outgroup comparison. Character-states occurring in both in- 
group and outgroup taxa were hypothesized to be primitive, whereas states re- 
stricted to ingroup taxa were considered derived (Hennig, 1966; Eldredge and 
Cracraft, 1980; Nelson and Platnick, 1981). 
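
The polarity rule just described can be stated compactly in code. The Python sketch below is an illustration of outgroup comparison as worded above, not the coding procedure actually used for Appendix I; `polarize` and the example state labels are hypothetical.

```python
def polarize(ingroup_states, outgroup_states):
    """Classify each state seen in the ingroup by outgroup comparison.

    States also present in the outgroup are hypothesized to be primitive;
    states restricted to the ingroup are hypothesized to be derived.
    """
    outgroup = set(outgroup_states)
    return {
        state: ("primitive" if state in outgroup else "derived")
        for state in sorted(set(ingroup_states))
    }


# Example with the sternal keel: both states occur in the ingroup,
# but only the keeled state occurs in the outgroup.
keel = polarize(
    ingroup_states=["keel present", "keel absent"],
    outgroup_states=["keel present"],
)
```

Here `keel` maps "keel present" to primitive and "keel absent" to derived.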

Outgroups were chosen on the basis of recent phylogenetic hypotheses for the 
basal clades of birds. Morphological (Cracraft, 1986, 1988) and molecular evidence 
(Prager et al., 1976; Stapel et al., 1984; Cracraft and Mindell, 1989) support the 
monophyly of the Neornithes. Within this clade, two monophyletic groups are 
supported by a diverse array of evidence: (1) the Paleognathae (Bock, 1963; Ho 
et al., 1976; Prager et al., 1976; Stapel et al., 1984; Cracraft, 1986, 1988; Bock and 
Buhler, 1988; Cracraft and Mindell, 1989; Sibley and Ahlquist, 1990); and (2) the 
Neognathae (Ho et al., 1976; Stapel et al., 1984; Cracraft, 1986, 1988; Cracraft and 
Mindell, 1989; Sibley and Ahlquist, 1990). Within the Neognaths, the galliforms 
and anseriforms are postulated to be the basal clade (Cracraft, 1988; Cracraft and 
Mindell, 1989; Sibley and Ahlquist, 1990). The primary clades of the Anseriformes 
are the Anhimidae, the Anatidae, and the Anseranatidae. The Anhimidae are the 
sister-group of the other two lineages and are osteologically more primitive (Live- 
zey, 1986; Sibley and Ahlquist, 1990). The main groups of the galliforms include 
the Cracidae, the Megapodiidae, the Odontophoridae, the Numididae, and the 
Phasianidae. The cracids and megapodes have been postulated as being basal lineages 
within the order (Cracraft, 1981). From these taxa, representatives from the
Anhimidae, Megapodiidae, and Cracidae were chosen as outgroups for the morphological
analysis. Gallus gallus was chosen to root the paleognath molecular tree (but see 
discussion below). 

The two primary lineages of the paleognaths are the ratites and the tinamids. 
Various studies (Cracraft, 1974, 1981, 1986, 1988; Prager et al., 1976; Stapel et al., 
1984; Bledsoe, 1988; Sibley and Ahlquist, 1990) have proposed that the tinamids 
are the sister-group of the ratites. Using this finding as a working hypothesis,
character-states that are present in basal neognaths and tinamous but not in ratites 
are here postulated to be primitive, whereas those restricted to one or more ratite 
taxa are derived. 

Phylogenetic analyses of all data were performed using the exhaustive search 
algorithm of test version 4.0d46 of PAUPSTAR (Phylogenetic Analysis Using Par- 
simony; D. Swofford, 1996, personal communication). Different approaches to 
molecular data analysis were undertaken (Cracraft and Helm-Bychowski, 1991), as 
described below. 
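
An exhaustive search is practical here because the number of possible trees is small for so few terminals. The number of unrooted, fully bifurcating trees for n taxa is (2n - 5)!!, the product of the odd integers from 1 to 2n - 5. The Python sketch below evaluates it; the function name is illustrative.

```python
def n_unrooted_trees(n_taxa):
    """Number of unrooted, fully bifurcating trees: (2n - 5)!! for n >= 3."""
    if n_taxa < 3:
        raise ValueError("at least three taxa are required")
    count = 1
    # Multiply the odd integers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count
```

For eight terminals, the number shown in Fig. 7.2, this gives 10,395 trees; at ten terminals it is already 2,027,025.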


III. RESULTS 
A. Morphological Analysis 


A parsimony analysis of 58 skeletal characters (ordered as stated in Appendix I) 
resulted in a single most parsimonious tree of 80 steps, with a consistency index of 
0.846 excluding uninformative characters (Fig. 7.2; see Appendix I for a detailed 
discussion of the characters and character-states supporting this hypothesis). The 
galliform and anseriform taxa were combined into one outgroup taxon since all of 
their character codings were identical in this analysis. Although the monophyly of 
the paleognaths is assumed here on the basis of other evidence not included in this 
study (Lowe, 1928; Bock, 1963; Ho et al., 1976; Prager et al., 1976; de Boer, 1980; 
Stapel et al., 1984; Cracraft, 1974, 1981, 1986, 1988; Bledsoe, 1988; Cracraft and 
Mindell, 1989; Sibley and Ahlquist, 1990), two synapomorphies were identified in 
this data set that corroborate their monophyly: (17) shallow transverse ligamental 
sulcus of the humerus, and (34) internal cnemial crest of the tibiotarsus extended 
proximally beyond the articular surface. 

Twenty characters support the monophyly of the ratites (Fig. 7.2). These characters
include the following: (1) the loss of a keel, (6) the sternum essentially equal
in width and length, (7) the loss of the sternal manubrium, (8) the fusion of the
scapula and coracoid, (11) the humerus at least one-third longer than the ulna,
(12) projection of the internal tuberosity of the humerus medially and proximally,
(13) the deltoid crest of the humerus reduced to a small ridge, (14) proximal
protrusion of both margins of the carpal trochleae greatly reduced so that both are
flattened, (16) one metacarpal with phalangeal articulation, (18) reduced external
epicondyle of the humerus, (19) narrowing of dorsal surface of synsacrum caudal to
antitrochanter, (28) the intercotylar prominence equal in proximal extension with
the hypotarsus on the tarsometatarsus, (29) external cotyla of tarsometatarsus slightly
concave, and internal surface deeply concave with concurrent loss of sharp proximal
protrusion on the medial margin, (37) a deep groove for the peroneus profundus
muscle on the tibiotarsus, (39) a deep pit anteriorly and a groove posteriorly on the
internal condyle of the tibiotarsus, (42) trochanteric crest of femur essentially on
same level with the iliac facet, (43) the iliac facet of the femur with a rounded edge,
loss of a lip, and convex to flattened surface, (44) external and fibular condyles of
femur greatly enlarged and projecting distally beyond level of internal condyle,
(49) posterior facet of internal condyle of femur triangular in shape, and
(53) projection of zygomatic process of squamosal anterolaterally over at least
two-thirds of the body of the quadrate.

FIGURE 7.2 Most parsimonious solution (80 steps; CI = 0.846) using a revised morphological data
set of 58 characters. All state changes in the characters represent transformations from 0 to 1 (Appendix
I) unless noted in parentheses. Underlined characters represent parallelisms. Character transformations
were ordered as discussed in Appendix I. Bootstrap percentages from 200 resamplings are shown in
boldface within brackets.

The monophyly of the dinornithids and Apteryx (Fig. 7.2) is supported by 10 
characters. The uniquely derived characters include the following: (5) a flattened 
sternum, (9) the coracoidal process of the scapulocoracoid in the form of a
mediolaterally compressed ridge, (20) the anterior portion of the ilia more elongated
than the posterior portion, (38) strong anterior projection of the internal condyle 
of the tibiotarsus, (54) the pterygoid divided into dorsal and ventral surfaces, form- 
ing a pterygoid fossa, (55) the intercondylar fossa of the quadrate with a deep, 
rounded pit, (56) mammillar tuberosities highly developed on the posterolateral
margins of the basitemporal plate, (57) the antrum of the maxillopalatine present 
and ankylosed to the dorsal surface of the posterior maxillary, and (58) the third 
vestibule of the nasal region formed into an olfactory chamber. In addition, dinor- 
nithids and Apteryx share one derived character that is postulated to be homopla- 
sious on the topology of Fig. 7.2: (3) the coracoidal sulci of the sternum laterally 
displaced (the sulci are also displaced in Rhea, but it is not as extreme, and its ho- 
mology to the condition in dinornithids and Apteryx is questionable). 

Ten characters support the monophyly of Struthio, Rhea, Dromaius, and Casuarius 
(Fig. 7.2), including nine that are postulated to be uniquely derived: (20) the pos- 
terior portion of the ilium longer than the anterior portion, (27) the internal ridge 
of the hypotarsus greatly reduced and a proximally protruding process on the exter- 
nal ridge, (30) the possession by the tarsometatarsus of a deep and narrow anterior 
metatarsal groove, (31) a sharp ridge present on the external side of the posterior 
shaft of the tarsometatarsus, (32) loss of digit I, (33) the base of the cnemial crests of 
the tibiotarsus compressed mediolaterally and the interarticular surface narrowed 
anteriorly, (35) supratendinal bridge of the tibiotarsus lost, (40) the anterior inter- 
condylar fossa of the tibiotarsus widened, undercutting the condyles and forming a 
sharp ridge, and (52) the presence of an elongate supraorbital process on the lacri- 
mal. One other synapomorphy, not uniquely derived, also supports the monophyly 
of the Struthionoidea: (2) loss of posterior lateral processes on the sternum. 

The monophyly of Dromaius and Casuarius (Fig. 7.2) is supported by seven 
uniquely derived characters and three that are homoplastic. The former include the 
following: (4) posterior extension of the ventral lip of the coracoidal sulci on the 
sternum, (15) the presence of a phalangeal articulation for the os metacarpale alu- 
lare, (22) a club-shaped expansion of the distal ilium, (25) transverse processes of 
the sacral vertebrae broadened and fused to form a ventral plate of bone, (47) the 
external condyle of the femur elliptical in shape and projecting proximally, (50) a 
shallow pit on the external condyle of the femur cutting into the fibular condyle, 
(57) an antrum present as a large "pocket" formed from the maxillopalatines and 
also ankylosing anteriorly with the posterior margin of the maxillary. The three 
homoplasious synapomorphies include the following: (36) the external condyle of 
the tibiotarsus flattened along the distal margin with its anterior portion slightly 
undercut (parallel in dinornithids), (45) the fibular condyle of the femur level proximally
with the external condyle and rounded posterolaterally (also found in dinornithids),
and (49) posterior facet of internal condyle of femur ovoid, a reversal.

Finally, 18 characters support the monophyly of Struthio and Rhea (Fig. 7.2). Of 
these, 16 are uniquely derived: (9) coracoidal process of the scapulocoracoid pronounced
and knoblike and projecting toward the glenoid facet, (10) glenoid facet
oriented dorsolaterally, (14) internal and external margins of the carpal trochlea 
essentially level with each other and well rounded, (21) postacetabular ilium nar- 
rows dorsoventrally and mediolaterally and tapers to a conical shape, (23) pubis 
elongated beyond the ilium and fused to the ischium, (24) obturator process of the 
ischium fused to the pubis to form the obturator foramen, (26) presence of a pu- 
boischial bar, (27) internal ridge of the hypotarsus lost and external ridge knoblike 
proximally, (33) surface of anterior interarticular area greatly reduced as the base of 
the crests becomes sharply compressed, (36) external condyle of the tibiotarsus 
ovoid distally and sharply undercut anteriorly, (41) external condyle of the tibiotar- 
sus possessing a moderate lateral extension, sharply undercut posteriorly, (45) pos- 
terior margin of the fibular condyle of the femur rounded and does not extend as 
far proximally as the internal condyle, (46) internal condyle of the femur flattened 
distally, (48) rotular groove of the femur narrow and deep, (51) popliteal fossa of 
the femur very deep, and (57) maxillopalatine antrum greatly reduced (see Appen- 
dix I). There are two homoplastic characters: (16) three metacarpals with phalangeal 
articulation, considered a reversal in Fig. 7.2, and (29) external and internal cotylar 
surfaces of the tarsometatarsus essentially level (also coded the same in dinornithids). 


B. Sequence Analysis 


The sequences reported here total 5444 bp from the mitochondrial genome, in- 
cluding 1682 bp of aligned sequence representing the entire 16S rRNA gene, 583 
bp of the 12S rRNA gene determined in our laboratory, 361 bp of the 12S rRNA 
gene previously reported by Cooper et al. (1992), 1011 bp of the COI gene, 592 bp
of the COII gene, 1143 bp of the entire cytochrome b gene, and 72 bp of a tRNA gene.
Because transition to transversion ratios between Gallus and the paleognaths, and
even within most paleognath comparisons, are approaching unity (Table II), it appears
that transition substitutions are nearing saturation. Therefore, the parsimony
analysis reported here was performed using transversions only (analyses on all sub- 
stitutions produced comparable results). In addition, because small “hypervariable” 
regions of the 16S rRNA and 12S rRNA genes were difficult to align, rendering 
homology statements among the taxa problematic at best, 225 bp of these genes 
were excluded; a total of 5219 bp was used in the analyses reported in this section. 
Finally, gaps in the sequences were treated as characters. 
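
The distinction between transitions and transversions that underlies the transversion-only analysis can be illustrated with a short sketch. It is an illustration only, not the procedure used for the analyses reported here: the function name count_ti_tv and the two toy sequences are hypothetical, and positions containing a gap or an ambiguous base are simply skipped, whereas this chapter treats gaps as characters.

```python
# Transitions are purine<->purine or pyrimidine<->pyrimidine changes;
# transversions are purine<->pyrimidine changes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Positions with a gap or a non-ACGT symbol are ignored (an assumption
    made for this sketch only).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical toy alignment, not data from this study.
toy_a = "ACGTACGTA-"
toy_b = "GCGTTCATAC"
ti, tv = count_ti_tv(toy_a, toy_b)
print(ti, tv)
```

A ratio of transitions to transversions near 1.0 for a pair of taxa, as in many comparisons of Table II, is the pattern taken in the text as a sign that transitions are nearing saturation.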

A transversion parsimony analysis of all 5219 bp yields a single most parsimoni- 
ous tree of 1357 steps (Fig. 7.3A) using Gallus as a root for the tree. It is apparent 
from Tables II and III that Gallus is quite distant (8-10% transversion difference)
from the paleognaths, and indeed extensive comparisons (see Section IV) suggest 
that Gallus is attaching to long branches within the paleognaths. Although both 
tinamous have very long branches, their clear synapomorphous sequence similarity 
unites them relative to the other taxa. The most parsimonious solution shows Apteryx
as the sister-group of Casuarius and Dromaius, Struthio as their sister-group, and
then Rhea as theirs, with all groupings having strong bootstrap support (Fig. 7.3A).

TABLE II Pairwise Distances between Taxa Based on 5219 bp of
Mitochondrial Sequences (a,b)

Taxon        Gallus   Apteryx  Rhea     Struthio Casuarius Dromaius Nothoprocta Tinamus
Gallus       —        0.170    0.157    0.161    0.163     0.158    0.188       0.184
Apteryx      471/414  —        0.129    0.132    0.118     0.114    0.165       0.168
Rhea         422/397  418/254  —        0.128    0.124     0.128    0.167       0.164
Struthio     421/417  444/246  398/272  —        0.118     0.122    0.164       0.167
Casuarius    451/398  437/180  403/245  387/231  —         0.085    0.159       0.162
Dromaius     431/395  419/178  417/249  416/219  346/95    —        0.153       0.159
Nothoprocta  481/500  474/389  470/403  435/419  443/388   410/386  —           0.159
Tinamus      456/502  479/400  446/410  450/423  450/397   428/400  467/365     —

(a) See materials and methods.
(b) Absolute transition/transversion distances below diagonal, mean uncorrected distances above
diagonal.
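
The two halves of Table II can be related to one another with simple arithmetic. The sketch below assumes that the uncorrected distance for a pair is the number of transitions plus transversions divided by all 5219 aligned sites; that divisor is an assumption of this example, and values for some pairs in the table may differ in the third decimal place. The names COUNTS and summarize are hypothetical.

```python
ALIGNED_SITES = 5219  # sites used in the analyses reported in this section

# (transitions, transversions) for three pairs, copied from Table II.
COUNTS = {
    ("Gallus", "Apteryx"): (471, 414),
    ("Gallus", "Rhea"): (422, 397),
    ("Gallus", "Nothoprocta"): (481, 500),
}


def summarize(transitions, transversions, sites=ALIGNED_SITES):
    """Return (uncorrected distance, transition/transversion ratio)."""
    p_distance = (transitions + transversions) / sites
    ratio = transitions / transversions
    return round(p_distance, 3), round(ratio, 2)


for pair, (ti, tv) in COUNTS.items():
    p, ratio = summarize(ti, tv)
    print(pair, p, ratio)
```

For these three pairs the computed distances agree with the values above the diagonal of Table II (0.170, 0.157, 0.188), and the ratios lie close to 1, which is the pattern the text describes as approaching unity.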

To examine the influence that the long branch of the outgroup may be having 
on the topology, Gallus was eliminated and the two tinamous were used to root the 
tree. The resulting most parsimonious solution was a tree of 1068 steps (Fig. 7.3B), 
identical in topology to the tree of Fig. 7.3A, but now bootstrap support has weak- 
ened. Indeed, the second most parsimonious tree, only 5 steps longer at 1073 steps, 
has a topology identical to the morphological tree of Fig. 7.2. 


C. Total Evidence Solution 


A combined analysis was undertaken using all the sequence data, as above, and the 
58 morphological characters previously discussed. With Gallus as the root of the 
tree, a single most parsimonious tree of 1450 steps was found (Fig. 7.4A) that was 
identical to the morphological tree (Fig. 7.2). In this solution morphological char- 
acters clearly influence the phylogenetic signal by uniting Rhea and Struthio, on the 
one hand, and Casuarius and Dromaius, on the other, and linking these clusters as 
sister-taxa, although bootstrap support for the latter relationship is not strong. The 
next most parsimonious tree was two steps away at 1452 steps and was identical in 
topology to the molecular tree of Fig. 7.3A. 

When Gallus was excluded, however, and the tinamous used as the root, a single 
most parsimonious tree was found at 1144 steps that was identical to the tree in- 
cluding Gallus, but now bootstrap support increased substantially (Fig. 7.4B). The 
next most parsimonious tree was nine steps away at 1153 steps and had the kiwi as 
the sister-group of the emu and cassowary. 
FIGURE 7.3 Transversion parsimony analysis of 5219 bp from the mitochondrial 16S rRNA, 12S
rRNA, COI, COII, cytochrome b, and tRNA genes. (A) A single most parsimonious tree of 1357 steps
resulted when using Gallus gallus as the root. (B) Gallus gallus has been excluded and two tinamous are 
used to root the tree. A single most parsimonious tree of 1068 steps was found. Bootstrap percentages 
from 200 resamplings are shown in boldface within brackets above branches, and branch lengths are 
shown below. 


IV. DISCUSSION 


A. Ratite Interrelationships: 
Morphological Data 


The relationships proposed here are identical to those proposed by Cracraft (1974; 
see Fig. 7.1A) and contradict the hypothesis proposed by Bledsoe (1988; see our 
Fig. 7.1C). Bledsoe (1988) hypothesized that the kiwis are more closely related to 
the emus and cassowaries than to all remaining ratites. Our reevaluation of Bledsoe’s 
characters, however, leads us to a different interpretation of the morphological data.

Six of Bledsoe’s characters involve wing elements: humerus (Bledsoe’s character 
19), ulna-radius (26), and carpometacarpus (30, 31, 32, 34). Wing elements are
greatly reduced in the kiwis, more so than in the other living ratites, the reduction 
presumably due in part to the much smaller size of the kiwis although it may be a 
secondary reduction shared with dinornithids. Because of this reduction in the size 
of the wing elements, most of the distinguishing processes and articulations are 
difficult to determine or differentiate (as was confirmed by an examination of con- 
siderable skeletal material), thus we conclude the forearm synapomorphies de- 
scribed by Bledsoe (1988) for Apteryx, Casuarius, and Dromaius are ambiguous. The 
other characters involve the femur (44, 48), the tibiotarsus (56, 64), and the tarso- 
metatarsus (71). On reanalysis, each of these characters except one (44) was found 
to be uninformative either because it also occurred in the outgroup, it was too 
variable to distinguish a well-defined character-state, or because the character could 
not be determined (for more detail, see Appendix II). On the basis of the data, our 
reexamination of Bledsoe's characters (1988) shows weak support for Bledsoe's pro- 
posed sister-group relationship of Apteryx to Dromaius and Casuarius and instead 
supports the hypothesis of Fig. 7.2. 


B. Why Do Molecular and Morphological 
Data Conflict? 


The results of the morphological data (Fig. 7.2) confirm the topology found previ- 
ously (e.g., Cracraft, 1974). This hypothesis, in which Apteryx is basal to the other 
ratites, is well supported as measured by high bootstrap values. The mitochondrial 
sequence data, on the other hand, come to a different conclusion—that Apteryx is 
linked with Casuarius and Dromaius; this hypothesis is also strongly supported by 
bootstrap procedures (Fig. 7.3A). Significantly, previous molecular studies cited 
earlier (Fig. 7.1) all fail to place kiwis at the base of the ratites. What might be the 
reasons for the conflict between morphological and molecular data sets? 

One answer, of course, is that both trees are wrong, and that more data will be 
needed to discover the true tree. A second answer is that the molecular data are 
yielding the true tree. If this is the case, however, then either the morphologi- 
cal data have been seriously misinterpreted with respect to homology statements 
(which does not seem likely given different investigators having obtained roughly 
similar results) or there has been rampant homoplasy in morphology. That the mor- 
phological data are severely incongruent with the molecular tree is easily shown by 
optimizing the morphological data onto it (after excluding the dinornithids). This 
exercise results in a tree 25 steps longer relative to the morphological tree (without 
dinornithids) and also results in at least one internodal branch length of zero. A third 
alternative is that the morphological tree is correct, in which case one would infer 
that the molecular data possess some systematic error (see below). On face value, 
this third hypothesis might be considered more acceptable than the second inas- 
much as the combined analysis (Fig. 7.4) indicates that a relatively small amount of 
FIGURE 7.4 A total evidence solution based on 5219 bp from the mitochondrial 16S rRNA, 12S
rRNA, COI, COII, cytochrome b, and tRNA genes as well as 58 morphological characters. (A) A
single most parsimonious tree of 1450 steps was found with Gallus gallus as the root of the tree. (B) With
Gallus gallus excluded, and with two tinamous as the root, a single most parsimonious tree of 1144 steps 
was found. Both trees are identical to the morphological tree of Fig. 7.2. Bootstrap percentages from 
200 resamplings are shown in boldface within brackets above branches, and branch lengths are shown 
below. All character transformations treated as in independent analyses shown in Figs. 7.2 and 7.3. 


morphological data (58 characters) can outweigh a large amount of sequence data 
(5219 bp). The conclusion from the above analysis is that the molecular data fit 
the morphological hypothesis better than the morphological data fit the molecular 
hypothesis. 
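
The idea of optimizing a morphological character onto a competing topology and counting the extra steps can be sketched for a single character. The example uses character 10 (orientation of the glenoid facet), scored as in Appendix I with Struthio and Rhea derived and the other taxa primitive, with dinornithids excluded and Tinamidae as the outgroup. It assumes unordered (Fitch) parsimony on fully bifurcating trees; the function name fitch and the nested-tuple tree encoding are choices made for this sketch, and it does not reproduce the full analysis that yields the 25-step difference.

```python
def fitch(tree, states):
    """Return (state set, step count) for a binary tree of nested tuples."""
    if isinstance(tree, str):          # a terminal taxon
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:                         # no change needed at this node
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


# Character 10 as scored in the data matrix of Appendix I.
CHARACTER_10 = {
    "Tinamidae": "0", "Apteryx": "0", "Casuarius": "0",
    "Dromaius": "0", "Struthio": "1", "Rhea": "1",
}

# Topology of Fig. 7.2 (morphology), without dinornithids.
MORPHOLOGICAL = ("Tinamidae",
                 ("Apteryx", (("Rhea", "Struthio"), ("Casuarius", "Dromaius"))))

# Topology of Fig. 7.3 (mitochondrial sequences).
MOLECULAR = ("Tinamidae",
             ("Rhea", ("Struthio", ("Apteryx", ("Casuarius", "Dromaius")))))

print(fitch(MORPHOLOGICAL, CHARACTER_10)[1])
print(fitch(MOLECULAR, CHARACTER_10)[1])
```

Under these assumptions the character requires one step on the morphological topology and two on the molecular topology; summing such differences over all characters is what gives the total excess length reported in the text.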

This last interpretation, if true, implies that the molecular topologies (Fig. 7.3A
and B) are misleading with respect to the true relationships. The most obvious suspicion
is that incorrect topologies are arising because of variation in evolutionary
rate among the different lineages, that is, they exemplify a "long branches attract"
problem or artifact (e.g., see Felsenstein, 1978; Penny et al., 1987; Smith, 1994;
Halanych, 1996; among many others). The proposition that the phylogenetic signal
contained in both the morphological and molecular data sets is not actually different
but that their apparent conflict is simply a rooting problem is shown in Fig. 7.5: if
the root joins the lineage leading to Apteryx, one has the morphological tree; if it
joins to Rhea, one obtains the molecular tree.

FIGURE 7.5 The conflict between morphological and molecular data sets seems to be a rooting problem
because the ingroup topologies of the unrooted trees are identical. If the root joins to Apteryx, the
resulting tree is like the morphological results of Fig. 7.2, whereas if the root joins to Rhea, the result is
like the molecular tree of Fig. 7.3. It is suggested in text that long-branched outgroups such as Gallus
and the tinamous are preferentially joining to a long-branched taxon of the ingroup, Rhea, owing to
homoplasy generated by long-branch attractions. Morphological data are probably less susceptible to
long-branch attractions, as discussed in text.
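
The rooting argument of Fig. 7.5 can be made concrete with a small sketch. It encodes the unrooted ingroup tree shared by the two data sets and roots it on the branch leading to a chosen terminal taxon. The internal node labels X, Y, and Z and the function names are arbitrary choices for this illustration, and children are sorted alphabetically only to give a repeatable text form.

```python
# Unrooted ingroup tree as an adjacency map of its three internal nodes.
UNROOTED = {
    "X": ["Rhea", "Struthio", "Y"],
    "Y": ["X", "Apteryx", "Z"],
    "Z": ["Y", "Casuarius", "Dromaius"],
}


def neighbors(node):
    """Nodes joined to `node`; a terminal taxon has exactly one neighbor."""
    if node in UNROOTED:
        return UNROOTED[node]
    return [inner for inner, adjacent in UNROOTED.items() if node in adjacent]


def subtree(node, parent):
    """Parenthetical description of everything on the far side of `parent`."""
    children = [n for n in neighbors(node) if n != parent]
    if not children:
        return node
    return "(" + ",".join(sorted(subtree(c, node) for c in children)) + ")"


def root_on_branch_to(taxon):
    """Place the root on the branch leading to a terminal taxon."""
    inner = neighbors(taxon)[0]
    return "(" + taxon + "," + subtree(inner, taxon) + ");"


print(root_on_branch_to("Apteryx"))
print(root_on_branch_to("Rhea"))
```

Rooting on the branch to Apteryx places Apteryx as sister to a group in which Rhea plus Struthio is sister to Casuarius plus Dromaius, as in Fig. 7.2; rooting on the branch to Rhea gives the nested arrangement of Fig. 7.3, with Apteryx sister to Casuarius plus Dromaius.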

Examination of the molecular data and results is consistent with the above expla- 
nation. Tinamous, separately or together, have long branches relative to the ratites 
(see Sibley and Ahlquist, 1990, p. 810, Fig. 325; also Mindell et al., 1996), and the 
long-branched outgroup, Gallus, not unexpectedly attaches to tinamous in the most 
parsimonious solution (Fig. 7.3A). Likewise, with Gallus excluded from the analy- 
sis, the tinamous join to another long branch, Rhea (Fig. 7.3B). This suggests the 
possibility that Apteryx is clustering with Casuarius and Dromaius merely as a consequence
of Rhea and Struthio being "pulled down" to the base of the ratite tree by
the outgroup, thereby precluding the placement of Apteryx at the base as suggested
by the morphological and total evidence solutions (Figs. 7.2 and 7.4).

TABLE III Pairwise Transversion Maximum Likelihood Distances for 5219 bp of
Mitochondrial Sequences (a)

Taxon        Gallus  Apteryx  Rhea   Struthio  Casuarius  Dromaius  Nothoprocta
Gallus       —
Apteryx      0.077   —
Rhea         0.075   0.049    —
Struthio     0.079   0.047    0.054  —
Casuarius    0.073   0.032    0.047  0.044     —
Dromaius     0.073   0.033    0.048  0.042     0.016      —
Nothoprocta  0.096   0.075    0.080  0.084     0.075      0.075     —
Tinamus      0.096   0.078    0.080  0.083     0.076      0.077     0.070

(a) See materials and methods.
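
A rough way to see which lineages are long is to compare the distances in Table III from a fixed reference taxon. The sketch below does this for Gallus and for Nothoprocta. It is only an informal comparison of raw pairwise distances, not a formal relative-rate test and not a substitute for the fitted branch lengths of Fig. 7.6; the names LOWER, distance, and distances_from are hypothetical.

```python
TAXA = ["Gallus", "Apteryx", "Rhea", "Struthio",
        "Casuarius", "Dromaius", "Nothoprocta", "Tinamus"]

# Lower triangle of Table III, one row per taxon in the order above.
LOWER = [
    [],
    [0.077],
    [0.075, 0.049],
    [0.079, 0.047, 0.054],
    [0.073, 0.032, 0.047, 0.044],
    [0.073, 0.033, 0.048, 0.042, 0.016],
    [0.096, 0.075, 0.080, 0.084, 0.075, 0.075],
    [0.096, 0.078, 0.080, 0.083, 0.076, 0.077, 0.070],
]

DIST = {
    frozenset((TAXA[i], TAXA[j])): value
    for i, row in enumerate(LOWER)
    for j, value in enumerate(row)
}

RATITES = ["Apteryx", "Rhea", "Struthio", "Casuarius", "Dromaius"]
TINAMOUS = ["Nothoprocta", "Tinamus"]


def distance(a, b):
    """Look up a pairwise distance regardless of the order of the taxa."""
    return DIST[frozenset((a, b))]


def distances_from(reference, taxa):
    """Distances from one reference taxon to each taxon in a list."""
    return {taxon: distance(reference, taxon) for taxon in taxa}


print(distances_from("Gallus", RATITES + TINAMOUS))
print(distances_from("Nothoprocta", RATITES))
```

Measured from Gallus, both tinamous (0.096) are farther away than any ratite (0.073 to 0.079). Measured from Nothoprocta, Struthio (0.084) and Rhea (0.080) are farther away than Apteryx, Casuarius, and Dromaius (0.075 each), which is in line with the long branches discussed in the text.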

Other factors may be exacerbating the effect of long-branch attraction. First, it 
is apparent from the transversion distance matrices (Tables II and III), as well as from
DNA hybridization distances (Sibley and Ahlquist, 1990, p. 810), that the radiation 
of the paleognaths is relatively ancient. Paleognaths are at the base of the modern
avian tree (Cracraft and Mindell, 1989; Sibley and Ahlquist, 1990; among others), 
and diversification among most of the paleognath lineages was likely in the Creta- 
ceous. The mere fact that they represent ancient lineages creates certain difficul- 
ties for the systematic analysis of DNA hybridization distances and mitochondrial 
gene sequences because of the increased homoplasy associated with deep branching 
events. 

An equally important factor, perhaps, is the observation that the major paleo- 
gnath lineages are seemingly exemplary of a star phylogeny, that is, they diverged 
from one another relatively close in time. Transversion distances (Tables II and III), 
fitted DNA hybridization distances (Sibley and Ahlquist, 1990, p. 810, Fig. 325), 
and fitted maximum likelihood distances derived from the mitochondrial sequences 
(Fig. 7.6) all suggest that the lineages leading to Casuarius/Dromaius, Struthio, and
Rhea diverged from one another relatively closely in time (but this picture may be 
distorted by rate differences among the lineages). Thus, internodal distances are 
generally much shorter than the branches leading to the terminal taxa. 


V. CONCLUSIONS 


The above arguments point to the morphological hypothesis as being best supported
by all the data. Although the placement of the kiwi remains uncertain, both morphological
and total evidence solutions place it at the base of the ratites, and a reevaluation
of the morphological evidence confirms a sister-group relationship with
moas (contra Cooper et al., 1992).

FIGURE 7.6 A maximum-likelihood tree of the molecular data (5219 bp; transversion parsimony;
rates assumed to follow gamma distribution with shape parameter of 0.5). The fitted distances suggest
short internodes among the major lineages of ratites, a result also found with DNA hybridization distances
(Sibley and Ahlquist, 1990). The fact that the branches of Gallus, tinamous, Rhea, and Struthio are
long relative to those of Apteryx, Casuarius, and Dromaius is also apparent.

A pertinent question is, what is the generality of these results? It was argued 
above (Fig. 7.5) that the debate over the placement of Apteryx on the tree seems 
dependent on where the outgroup joins to the ingroup. Because DNA hybridiza- 
tion distances (Sibley and Ahlquist, 1990) suggest that the relative rates in the nu- 
clear genome essentially parallel those in the mitochondrial genome, there is at least 
some reason to predict that any long-branch artifacts of clustering arising within 
one data set might also be present in the other. Thus, the fact that there is congru- 
ence in results from DNA hybridization studies and those based on mitochondrial 
gene sequences does not necessarily constitute a sufficient reason for believing that 
molecular data are producing a robust estimate of relationships. Indeed, if internodal
distances are short, it might be expected that in many instances morphology will 
actually have a better chance of resolving relationships than will molecular data. 
This follows from the observation that, even with short internodes, a suite of mor- 
phological markers (synapomorphies) can arise to characterize a clade, whereas in- 
sufficient molecular change will have accumulated during this time interval be- 
tween branching events. The data presented here can be interpreted from this 
perspective, and a similar explanation has been proposed for corvine birds in which 
morphological data clearly unite manucodes with other birds of paradise but sequence
data do not, except when close outgroups are chosen (Helm-Bychowski
and Cracraft, 1993). Another factor is that phylogenetic reconstructions of molecu- 
lar data, whether they be distances or discrete characters, might be expected to be 
more susceptible to rate effects than are morphological data merely because mor- 
phological data sets are rarely large enough to show sufficient "random" or parallel 
homoplasy to create long edges; usually the investigator consciously chooses char- 
acters that do not exhibit rampant variation—thus, parallelism—among the taxa; 
or the characters are chosen because they suggest synapomorphy rather than auta- 
pomorphy, the latter of which contributes to long edges. To our knowledge, no 
morphological study has invoked a long-edges-attract argument for a particular tree 
topology, although morphological systematists perhaps need to take this possibility 
more seriously. 

One important conclusion from this study is that difficult problems in avian sys- 
tematics will not be resolved by small amounts of DNA sequence or by inadequate 
taxon sampling. Although larger mitochondrial data sets have been shown to out- 
perform smaller data sets in recovering relationships among relatively divergent taxa 
(Cummings et al., 1995; Mindell and Thacker, 1996), it remains to be demonstrated 
whether any amount of sequence data will be able to resolve certain phylogenetic 
questions having characteristics similar to the ones surrounding the paleognaths: a 
distant outgroup, taxa having a deep history, a small number of taxa in the groups 
being compared, and/or short internodal distances relative to long terminals. Mo- 
lecular data present many severe analytical difficulties, and thus morphological in- 
vestigations will take on increasing importance for groups such as the paleognaths. 
Morphological Character Descriptions
for Hypothesis of Fig. 7.2


The primitive state is designated as (0), the derived states as (1, 2, 3) and unknown
as (?); for multistate characters the order is listed under each character when necessary.
The postcranial analysis is based on a reexamination of the characters described
by Cracraft (1974) and Bledsoe (1988). Their analyses included Aepyornis, and Bledsoe’s
(1988) included the Dromornithidae, but because of a lack of available specimens,
and the fact that dromornithids are probably not paleognaths, both groups
were excluded from this analysis. With respect to the postcranial characters de- 
scribed by Cracraft (1974), four were modified before inclusion in this study. A 
reevaluation of Bledsoe (1988) resulted in the modification of 30 characters and the 
exclusion of 42 (these are discussed separately in Appendix II). Seventeen characters 
are unique to this analysis: 7 postcranial and 10 cranial. 

Characters and character-states used for the morphological analysis of this chap- 
ter include the following: 


Character                  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

Galliformes/Anseriformes   0  0  0  0  0  0  0  0  ?  0  0  0  0  0  0
Tinamidae                  0  0  0  0  0  0  0  0  ?  0  0  0  0  0  0
Dinornithids               1  1  1  ?  1  1  1  1  2  0  ?  ?  ?  ?  ?
Apteryx                    1  1  1  ?  1  1  1  1  2  0  1  1  1  2  ?
Casuarius                  1  1  0  1  0  1  1  1  1  0  1  1  1  2  1
Dromaius                   1  1  0  1  0  1  1  1  1  0  1  1  1  2  1
Struthio                   1  1  0  0  0  1  1  1  3  1  1  1  1  1  0
Rhea                       1  1  1  ?  0  1  1  1  3  1  1  1  1  1  0

Character                 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Galliformes/Anseriformes   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Tinamidae                  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0
Dinornithids               ?  ?  ?  0  2  0  0  0  0  0  0  2  1  2  0
Apteryx                    1  1  1  1  2  0  0  0  0  0  0  2  1  0  0
Casuarius                  1  1  1  1  1  0  1  0  0  1  0  1  1  1  1
Dromaius                   1  1  1  1  1  0  1  0  0  1  0  1  1  1  1
Struthio                   0  1  1  1  1  1  ?  1  1  0  1  1  1  2  1
Rhea                       0  1  1  1  1  1  ?  1  1  0  1  1  1  2  1

Character                 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45

Galliformes/Anseriformes   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Tinamidae                  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0
Dinornithids               0  0  0  1  0  1  1  1  1  0  0  0  1  1  0
Apteryx                    0  0  0  1  0  0  1  1  1  0  0  1  1  0  1
Casuarius                  1  1  1  1  1  1  1  0  1  1  0  1  1  1  1
Dromaius                   1  1  1  1  1  1  1  0  1  1  0  1  1  1  1
Struthio                   1  1  2  1  1  2  1  0  1  1  1  1  1  1  2
Rhea                       1  1  2  1  1  2  1  0  1  1  1  1  1  1  2

Character                 46 47 48 49 50 51 52 53 54 55 56 57 58

Galliformes/Anseriformes   0  0  0  0  0  0  0  0  0  0  0  0  0
Tinamidae                  0  0  0  0  0  0  0  0  0  0  0  0  0
Dinornithids               0  0  0  1  0  0  0  1  1  1  1  1  1
Apteryx                    0  0  0  1  0  0  0  1  1  1  1  1  1
Casuarius                  0  1  0  0  1  0  1  1  0  0  0  2  0
Dromaius                   0  1  0  0  1  0  1  1  0  0  0  2  0
Struthio                   1  0  1  1  0  1  1  1  0  0  0  3  0
Rhea                       1  0  1  1  0  1  1  1  0  0  0  ?  0
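
The data matrix can be queried directly for states shared by the members of a proposed clade and found in no other scored taxon. The sketch below does this for characters 1-15 only. The rows are transcribed from the first block of the matrix, with unscored cells entered as "?", and the function name uniquely_shared is hypothetical. Sharing a unique state is a simple screening criterion and does not by itself establish that the state is derived.

```python
# Characters 1-15 from the data matrix; "?" marks an unscored cell.
ROWS = {
    "Galliformes/Anseriformes": "0 0 0 0 0 0 0 0 ? 0 0 0 0 0 0",
    "Tinamidae":                "0 0 0 0 0 0 0 0 ? 0 0 0 0 0 0",
    "Dinornithids":             "1 1 1 ? 1 1 1 1 2 0 ? ? ? ? ?",
    "Apteryx":                  "1 1 1 ? 1 1 1 1 2 0 1 1 1 2 ?",
    "Casuarius":                "1 1 0 1 0 1 1 1 1 0 1 1 1 2 1",
    "Dromaius":                 "1 1 0 1 0 1 1 1 1 0 1 1 1 2 1",
    "Struthio":                 "1 1 0 0 0 1 1 1 3 1 1 1 1 1 0",
    "Rhea":                     "1 1 1 ? 0 1 1 1 3 1 1 1 1 1 0",
}
MATRIX = {taxon: row.split() for taxon, row in ROWS.items()}


def uniquely_shared(clade, matrix):
    """Character numbers whose state is identical and scored in every clade
    member and absent from every taxon outside the clade."""
    hits = []
    n_characters = len(next(iter(matrix.values())))
    for index in range(n_characters):
        inside = {matrix[taxon][index] for taxon in clade}
        if len(inside) != 1 or "?" in inside:
            continue
        state = inside.pop()
        outside = {matrix[t][index] for t in matrix if t not in clade}
        if state not in outside:
            hits.append(index + 1)   # characters are numbered from 1
    return hits


print(uniquely_shared(["Struthio", "Rhea"], MATRIX))
print(uniquely_shared(["Dinornithids", "Apteryx"], MATRIX))
print(uniquely_shared(["Casuarius", "Dromaius"], MATRIX))
```

For Struthio plus Rhea the screen returns characters 9, 10, and 14, and for dinornithids plus Apteryx it returns 5 and 9, matching the uniquely derived characters listed for those clades within this range. For Casuarius plus Dromaius it returns 4, 9, and 15; character 9 appears because the two taxa share the "absent" state, whose polarity cannot be determined from the outgroups.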


1. Keel of the sternum (Cracraft, 1974, pp. 503, 506): (0) present, (1) ab- 
sent. The loss of the keel is derived for all ratites. 

2. Sternum, posterior lateral processes: (0) present, (1) absent. Cracraft 
(1974, pp. 503 and 506) took both the loss and reduction of the posterior lat- 
eral processes as being derived for all ratites. Only the presence/absence of the 
processes are recognized here as distinct character states. The complete loss of 
the posterior lateral processes in Casuarius, Dromaius, and Rhea is postulated to 
be derived. 


3. Coracoidal sulci, lateral displacement: (0) sulci meet at or near the midline,
(1) sulci displaced laterally. Cracraft (1974, p. 503) described laterally displaced
sulci as the derived condition, but did not state explicitly which taxa
display this condition. Bledsoe (1988, character 3) described the cranial margin 
as being concave cranially (Apteryx, Rhea), straight (Dinornithidae), and convex 
cranially (remaining ratites) and further recognized the articular coracoidal sulci 
(Bledsoe’s character 6) as having states (a) meeting or nearly meeting medially, 
and (b) restricted to the lateral margins. Bledsoe considered Apteryx, Rhea, and 
the dinornithids to have the derived condition. In this study only the lateral 
displacement of the sulci is described because the curvature of the cranial mar- 
gin is largely related to the location of the coracoidal sulci. Apteryx, Rhea, and 
the Dinornithidae display the derived condition, although the displacement in 
Rhea is not as extreme as it is in the other taxa. 

4. Coracoidal sulci, ventral lip: (0) extends laterally toward the base of the 
scapulocoracoid process, (1) directed more strongly posteriorly. The Tinami- 
dae, Anhimidae, and Struthio are characterized by the primitive condition. In 
the derived condition found in Dromaius and Casuarius, the ventral lip of the 
sulci angles more posteriorly toward the midline of the body of the sternum 
(Mivart, 1877, pp. 25, 33, Figs. 19 and 27). Rhea, Apteryx, and the Dinornithi- 
dae cannot be scored for this condition because of the lateral displacement of 
the entire sulci (see character 3 above). 

5. Sternum, lateral view: (0) moderate to highly curved, (1) flattened. 
Bledsoe (1988, character 5) described this condition as four character-states: 
(a) highly curved, carina present, (b) highly curved, low medial ridge present,
(c) moderately curved, no distinct medial ridge, and (d) flattened. Character-state
(b) was unique to Rhea, (c) was described for Casuarius, Dromaius, and
Struthio, and (d) for Apteryx and the dinornithids. The presence or absence of a
keel is described above (see character 1). Regarding the curvature of the sternum,
it is either curved (ancestral) or flattened (derived) and only Apteryx and
the Dinornithidae display a flattened sternum. 

6. Proportions of the body of the sternum: (0) much longer anteroposte- 
riorly than wide mediolaterally, (1) essentially equal in width and length. Bled- 
soe (1988, character 1) coded unique proportions for Apteryx and Aepyornis 
whereas here Apteryx is coded as being the same as the other ratites, which dis- 
play the derived condition. 

7. Sternum, manubrium: (0) present, (1) absent. The loss of the sternal ma- 
nubrium is a synapomorphy for all of the ratites. 

8. Scapula and coracoid (Cracraft, 1974, pp. 503, 506): (0) not fused,
(1) fused. The fusion of the scapula and coracoid is derived for all ratites.

9. Coracoidal process (Cracraft, 1974, pp. 505, 507-508; Bledsoe, 1988,
character 9): (1) absent, (2) in the form of a ridge located anteriorly to the gle- 
noid facet and compressed mediolaterally, (3) very pronounced and knoblike, 
projects toward the glenoid facet. This is an unordered character because the 
primitive condition is unknown. The fusion of the scapula and coracoid is 
unique to the ratites (see character 8), and thus the outgroups have no homologous
condition to polarize this character. The process, absent in Casuarius and
Dromaius, is in the form of a mediolaterally compressed ridge in Apteryx and 
the moas, and very knoblike and pronounced in Struthio and Rhea. 

10. Glenoid facet (Bledsoe, 1988, character 7): (0) oriented laterally,
(1) oriented dorsolaterally. The dorsolateral orientation of the glenoid facet on
the scapulocoracoid is derived for Struthio and Rhea. 

11. Humerus, length relative to the ulna (Cracraft, 1974, pp. 505-506):
(0) shorter than the ulna, (1) at least one-third longer than the ulna. This character
combines characters 11 and 29 of Bledsoe (1988). In character 11 Bledsoe
described the proportions of the humerus as being (a) slender, length moderate, 
(b) slender, length elongate, (c) stout, short, and (d) slender, short. Character- 
states (c) and (d) are unique for Casuarius and Dromaius, respectively, but essen- 
tially the elements of both are short. Although the humerus of these two taxa is 
short, it is at least one-third longer than the ulna. In character 29 Bledsoe described
the lengths of the ulna-radius relative to the humerus as being (a) subequal,
(b) ulna-radius longer than the humerus, and (c) humerus longer than
the ulna-radius. For this latter character, state (c) inexplicably did not appear
in Bledsoe's data matrix, (a) was ancestral, and (b) was derived for all of the 
ratites. The primitive condition is coded here as a humerus that is shorter than 
the ulna-radius, and all of the ratites display the derived condition. Because
dinornithids cannot be scored for wing elements, characters 11-18 are coded 
as unknown. 

12. Humerus, internal tuberosity: (0) protrudes slightly medially, does not 
extend proximally to the level of the head of the humerus, (1) knoblike with 
great medial protrusion, extends proximally to the level or slightly beyond the 
humeral head. Bledsoe (1988, character 14) described three character states:
(a) head protrudes beyond tuberculum (tuberosity), (b) subequal, (c) tuberosity
protrudes beyond head. Bledsoe indicated that Apteryx, Rhea, and Struthio have 
the tuberculum protruding beyond the head of the humerus and that Casuarius 
and Dromaius are subequal. Given the variation observed in the specimens ex- 
amined for this study, Struthio, Rhea, Dromaius, Casuarius, and Apteryx are all 
coded as having the tuberosity more or less protruding to the level of the hu- 
meral head. 

13. Deltoid crest: (0) greatly flares laterally from the base of the external tuberosity
and tapers to a prominent ridge as it continues distally, (1) slightly
raised ridge beginning from the base of the external tuberosity. Bledsoe (1988,
character 18) described this character in terms of the absence or presence of a
ridge at the base of the pectoral crest, and described the ridge as being present 
in Apteryx, Struthio, and Rhea. In this study, the reduction of the crest to a 
barely perceptible ridge is postulated as being derived for all ratites. 

14. Carpal trochleae, carpometacarpus: (0) strong proximal protrusion of 
the external margin relative to the internal margin, with both margins well 
rounded, (1) proximal protrusion of the external margin reduced so as to be
essentially level with the internal margin, and with both margins well rounded, 
(2) proximal protrusion of both margins greatly reduced so that they are flat- 
tened. Bledsoe (1988, character 30) described the carpal trochlea as (a) highly 
curved, (b) moderately curved, or (c) flattened. Rhea and Struthio were scored as 
primitive (a), Casuarius as moderately curved, and Apteryx and Dromaius as flat- 
tened. For the specimens examined here, Struthio and Rhea are distinguishable 
from the ancestral state and display an intermediate condition in which the
external margin of the trochlea is greatly reduced relative to the level of the
internal margin, and both margins are well rounded. Casuarius, Dromaius, and
Apteryx are considered to have the derived condition in which the trochleae are so
reduced as to appear flattened. This character is ordered 1-0-2.

15. Phalangeal articulation for os metacarpale alulare (Bledsoe, 1988, char- 
acter 31): (0) present, (1) absent. The loss of this articulation is derived for Cas- 
uarius and Dromaius. Apteryx is coded as questionable because the articulation 
has been found present in some specimens and absent in others. 

16. Number of metacarpals with phalangeal articulation (Bledsoe, 1988, 
character 34): (0) three, (1) one. Dromaius, Casuarius, and Apteryx are consid- 
ered to display the derived condition. 

17. Humerus, transverse ligamental sulcus: (0) deep, (1) shallow. Bledsoe 
(1988, character 12) described the sulcus as (a) deep, (b) shallow, and (c) absent. 
The Dromornithidae were the only taxon coded as lacking a sulcus. The con- 
dition in the Tinamidae (a shallow transverse ligamental sulcus) is here con- 
sidered to be homologous to that of the ratites, which display the derived 
condition. 

18. Humerus, external epicondyle: (0) well developed and pronounced,
(1) reduced to slightly raised surface. Bledsoe (1988, character 19) described the
epicondyle as (a) well developed, (b) moderately developed, and (c) highly re- 
duced. Bledsoe considered Struthio to be well developed, Rhea to be moderately 
developed, and Casuarius, Dromaius, and Apteryx to be highly reduced. In this 
analysis the external epicondyle is taken to be greatly reduced and derived for 
all ratites. Because the humerus of ratites is so modified, and skeletons exhibit 
considerable variation in what might be recognized as an epicondyle, it is not 
always easy to partition variation into clear character-states. 

19. Dorsal surface of synsacrum caudal to antitrochanter (Cracraft, 1974, 
pp. 503–504; Bledsoe, 1988, character 35): (0) broad, (1) narrow. The
narrowing of the dorsal, posterior synsacrum is derived for all ratites except the
Dinornithidae. 

20. Relative lengths of the anterior and posterior ilia (relative to the ace- 
tabulum): (0) approximately equal to each other in length, (1) posterior portion 
longer than the anterior portion, (2) anterior portion longer than the posterior 
portion. Apparently Bledsoe (1988, character 37) miscoded this character because
the anterior portion is described as being more elongate in Rhea and Struthio;
the posterior portions are clearly more elongate than the anterior portions
(for Rhea, see Mivart, 1877, p. 2, Fig. 2). Casuarius and Dromaius also display 
this derived condition (Mivart, 1877, pp. 16 and 27, Figs. 13 and 22). Apteryx 
and the dinornithids display the other derived condition in which the anterior 
portion is more elongate than the posterior portion (for the dinornithids, see 
Oliver, 1949; for Apteryx, Parker, 1891, plate 18, Figs. 278–279). The
character is ordered 1-0-2.
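
The effect of an ordering such as 1-0-2 can be sketched as follows. This is a minimal illustration, assuming the usual convention for linearly ordered characters, in which a change between two states costs one step for each position separating them in the stated sequence; the function names are hypothetical.

```python
def ordered_steps(order, state_a, state_b):
    """Steps between two states of a linearly ordered character.

    `order` lists the states in their transformation sequence,
    e.g. (1, 0, 2) for a character described as ordered 1-0-2.
    """
    return abs(order.index(state_a) - order.index(state_b))

def unordered_steps(state_a, state_b):
    """Steps between two states when the character is unordered."""
    return 0 if state_a == state_b else 1

ORDER_1_0_2 = (1, 0, 2)  # ancestral state 0 lies between the two derived states
ORDER_0_1_2 = (0, 1, 2)  # state 1 is intermediate between 0 and 2

# Under 1-0-2, passing from one derived state to the other goes through 0.
steps_between_derived = ordered_steps(ORDER_1_0_2, 1, 2)
```

Under the ordering 1-0-2, each derived state is one step from the ancestral state and two steps from the other derived state; if the same character were treated as unordered, every change would cost one step.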

21. Postacetabular ilium, lateral view (Cracraft, 1974, pp. 503–504, Fig. 9):
(0) essentially the same height dorsoventrally and width mediolaterally, dorsal 
margin curved, (1) narrows dorsoventrally and mediolaterally, tapering to a 
conical shape, dorsal margin flattened. Bledsoe (1988, character 40) described 
this condition as the dorsoventral width of the postacetabular ilium: (a) narrow, 
(b) moderately wide, and (c) very wide. Character-state (a) was ascribed to 
Struthio and Rhea, (b) was unique for Apteryx, and (c) was described for Casua- 
rius and Dromaius. On the basis of an examination of Hesperornis (Marsh, 1880, 
plate X, Fig. 2) and the relative proportions of the other outgroups, the condi- 
tion in which the posterior ilium remains approximately the same height and 
width for its entire length is considered ancestral. The condition found in Rhea 
and Struthio in which the posterior ilium narrows in height and width, tapering 
to a conical shape, is derived. 

22. Club-shape expansion of the distal ischium: (0) absent, (1) ischium long, 
straight and narrow, ending in a hammer-like expansion that fuses or abuts 
against the posterior end of the ilium. In the outgroups, Apteryx, and the dinor- 
nithids the ischium is broad dorsoventrally for almost the entire length and the 
distal end is free (Cracraft, 1974, p. 504, Fig. 9; Oliver, 1949, Figs. 90 and 104). 
In Struthio and Rhea the ischium fuses with the pubis (see character 23 below),
and so these taxa are coded as (?) because this character does not apply to them. The
ischium of Dromaius and Casuarius is narrow dorsoventrally and the distal por- 
tion ends in a hammer-like expansion (Mivart, 1877, Figs. 13 and 22). 

23. Elongation and fusion of the pubis (Cracraft, 1974, pp. 503–504, 508):
(0) essentially equal in length to the ilium and unfused, (1) elongated beyond 
the ilium and fused to the ischium. Bledsoe (1988, character 38) described the 
caudal protrusion of the ilium, ischium, and pubis as (a) pubis and ischium ex- 
tend beyond ilium, (b) pubis extends beyond ischium, which extends beyond 
ilium, (c) protrusion subequal, and (d) ischium protrudes beyond subequal
pubis and ilium. Character-state (d) was unique to Aepyornis. Character-state
(b) occurred in Rhea, Struthio, and the Dinornithidae, and character-state 
(c) was found in Dromaius and Casuarius. For this analysis, the elongation of the 
pubis beyond the ilium and fusion to the ischium is found to occur only in 
Struthio and Rhea. In the other ratites and the outgroups, the posterior exten- 
sions of the pubis, ilium, and ischium are essentially the same and they are 
unfused. 

24. Obturator process of the ischium: (0) not fused with the pubis, (1) fused 
with the pubis to form a complete obturator foramen. This is derived for
Struthio and Rhea.

25. Transverse processes of sacral vertebrae, ventral view: (0) processes sepa- 
rate and not broadened to fuse with each other, narrowly fused to anterior and 
posterior ilia, (1) processes broadened and fuse with each other and with the ilia 
to form a ventral plate of bone. The derived condition is found only in Dro- 
maius and Casuarius (Mivart, 1877, Figs. 16 and 26). 

26. Puboischial bar (Cracraft, 1974, pp. 503–504, 508): (0) absent, (1) present.
A puboischial bar is present only in Struthio and Rhea. Bledsoe (1988,
character 42) described this character as the dorsoventral width of pubis and ischial
bar; it appears Bledsoe is referring to the same character as a puboischial bar.

27. Hypotarsus: (0) broad with two ridges, the internal ridge projecting 
more posteriorly than the external, (1) internal ridge greatly reduced, external 
ridge with sharp process protruding proximally, (2) internal ridge completely 
lost, external ridge knoblike at proximal end (modified from Cracraft, 1974, 
pp. 502, 506–508 and from Bledsoe, 1988, characters 66, 67, 68, and 73).
Cracraft (1974) described four character-states. Cracraft coded a broad hypotarsus
with two ridges (internal being larger) as primitive, and both ridges being pro- 
nounced more posteriorly as an intermediate condition. Cracraft then recog- 
nized two derived states, one in which there is a reduction of the external ridge 
and consequent development toward a single ridge, and another being a single 
ridge located along the external side of the bone. The intermediate state united 
Apteryx with the Dinornithidae, the first derived state united Casuarius and 
Dromaius, and the second derived state united Rhea and Struthio. Bledsoe's
character 66 described Aepyornis and the Dromornithidae as differing in the shape
of the hypotarsus from the other ratites. Bledsoe's character 67 was
autapomorphic for the Dromornithidae, character 68 united all of the ratites
(excluding Aepyornis and the Dromornithidae), and character 73 united Struthio, Rhea,
Casuarius, and Dromaius. This analysis postulates Apteryx and the Dinornithidae 
to resemble the primitive condition, Casuarius and Dromaius to display an inter- 
mediate condition, and Rhea and Struthio to display a further derived condition. 
For the latter character-state, the complete loss of the internal ridge results in 
the hypotarsus being located more laterally than in the other taxa where it re- 
mains near the midline. The character order is 0-1-2. 

28. Tarsometatarsus, intercotylar prominence (Cracraft, 1974, pp. 503, 
506–507; Bledsoe, 1988, character 70): (0) proximally extended beyond
hypotarsus, (1) essentially equal to the hypotarsus in proximal extension. The
derived condition occurs in all ratites.

29. Tarsometatarsus, depth of the internal and external cotylar surfaces 
(modified from Bledsoe, 1988, characters 69 and 72): (0) external cotylar
surface relatively flat, internal surface concave with sharp proximal protrusion on
its medial margin, (1) external cotylar surface slightly concave, internal surface 
deeply concave with concurrent loss of sharp proximal protrusion on the
medial margin, (2) external and internal cotylar surfaces concave and essentially
equal in depth. The intermediate condition occurs in Dromaius and Casuarius 
and the derived condition is found in the Dinornithidae, Struthio, and Rhea. 
Apteryx shows the primitive condition. The character is ordered 0-1-2. 

30. Anterior metatarsal groove (Cracraft, 1974, pp. 503, 507): (0) shallow 
and broad proximally, flattens out distally, (1) deep and narrow for the entire 
length. A deep and narrow metatarsal groove on the tarsometatarsus is derived 
for Casuarius, Dromaius, Struthio, and Rhea. 

31. Posterior view of shaft of the tarsometatarsus (Cracraft, 1974, p. 502, 
Bledsoe, 1988, character 74): (0) generally smooth, lacking significant ridges, 
(1) sharp ridge present on external side running proximodistally. The derived 
condition is found in Casuarius, Dromaius, Struthio, and Rhea. 

32. Trochleae/digits of the tarsometatarsus (Cracraft, 1974, p. 502): (0) four 
present, (1) loss of digit I. The loss of digit I is derived for Casuarius, Dromaius, 
Struthio, and Rhea. 

33. Cnemial crests, tibiotarsus: (0) surface of anterior interarticular area and 
base of the crests wide, (1) surface of the anterior interarticular area narrows as 
the base of the crests becomes mediolaterally compressed, (2) surface of anterior 
interarticular area greatly reduced as the base of the crests becomes sharply 
compressed. Cracraft (1974, pp. 502, 506–508, Fig. 7D–F) recognized only a
difference between Rhea and Struthio and the other ratites. Bledsoe (1988) saw 
variation in terms of three separate character descriptions: 

Character 51. Mediolateral compression of cranial and lateral crests: (a) slight or no
compression, (b) moderate compression, and (c) substantial compression. Character-state
(b) was unique for the Dromornithidae and state (c) united Rhea and Struthio.

Character 52. Extent of cnemial crest and remaining articular surface in proximal
view: (a) cnemial crest equal in extent to remaining articular surface, (b) cnemial crest
less extensive than remaining articular surface, and (c) cnemial crest more extensive than
remaining articular surface. Character-state (b) was described for dinornithids, Apteryx,
and Aepyornis, and state (c) united Struthio, Rhea, Casuarius, and Dromaius.

Character 54. Lateral margin between lateral cnemial crest and lateral articular
surface: (a) shallowly concave, and (b) deeply concave. The derived condition was restricted
to Casuarius and Dromaius.


In this analysis, reduction of the anterior interarticular surface and the devel- 
opment of a deeply concave lateral margin are considered to be due to the me- 
diolateral constriction at the base of the cnemial crests. Apteryx and the dinor- 
nithids resemble the ancestral condition. Dromaius and Casuarius display an 
intermediate condition in which the mediolateral compression is not as
extreme as it is in Struthio and Rhea. The character is ordered 0-1-2.

34. Tibiotarsus, proximal extension of internal cnemial crest: (0) flat to 
slightly extended above level of interarticular surface, (1) moderate to greatly 
extended beyond articular surface. In describing the proximal extension of the 
internal cnemial crest, Bledsoe (1988, character 53) recognized three character 
states: (a) moderately beyond the articular surface, displayed by Apteryx and the 
dinornithids, (b) slightly beyond the articular surface, described for Rhea and
Struthio, and (c) far beyond the articular surface, found in Dromaius and Casua- 
rius. The ancestral character-state found in outgroups shows the internal cne- 
mial crest flattened or only slightly extended above the level of the articular 
surface. Enough variation occurred among the specimens examined that the 
derived condition is best described as moderate to greatly extended above the 
articular surface (Cracraft, 1974, p. 501, Fig. 7A–C) and is found in all of the
ratites and the Tinamidae. 

35. Supratendinal bridge of the tibiotarsus (Cracraft, 1974, p. 501; Bledsoe, 
1988, character 57): (0) present, (1) absent. The loss of the supratendinal bridge
on the tibiotarsus is derived for Casuarius, Dromaius, Struthio, and Rhea. 

36. Tibiotarsus, external condyle (Cracraft, 1974, pp. 500–501, 507–508,
Fig. 6A): (0) rounded along distal margin, anterior margin grades into the shaft 
smoothly (not undercut), (1) flattened along distal margin, anterior portion 
slightly undercut, (2) ovoid along distal margin, anterior portion sharply
undercut. Bledsoe (1988, character 61) recognized only a difference between
the primitive condition and the derived state occurring in Struthio and Rhea.
Here a derived condition is also hypothesized to
occur in Dromaius, Casuarius, and the dinornithids (character-state 1). The 
other derived condition (character-state 2) is postulated for Struthio and Rhea. 
Apteryx resembles the primitive condition. The character is ordered 1-0-2. 

37. Tibiotarsus, depression on the lateral surface of external condyle: (0) shallow,
(1) deep (Bledsoe, 1988, character 63). The depression is presumed to be
the groove for the peroneus profundus muscle, and a deep groove is derived for 
all ratites. 

38. Tibiotarsus, internal condyle distal view (Cracraft, 1974, pp. 501, 507–508,
Fig. 6B): (0) essentially level or slightly projected anteriorly relative to
external condyle, (1) projects strongly anteriorly relative to external condyle. The
derived condition is found only in Apteryx and the dinornithids. 

39. Tibiotarsus, medial side of the internal condyle (Cracraft, 1974, pp. 
501, 507–508, Fig. 6B): (0) slight depression near anterior margin, (1) deep pit
in anterior margin and a groove along the posterior margin. The derived con- 
dition occurs in all ratites. 

40. Tibiotarsus, anterior intercondylar fossa (Cracraft, 1974, pp. 498, 506, 
Fig. 5): (0) narrow and does not undercut the condyles, (1) widens and under- 
cuts the condyles at the proximal margin, forming a slight ridge that distin- 
guishes the articular surface from the fossa. The derived condition is found in 
Dromaius, Casuarius, Struthio, and Rhea. 

41. Tibiotarsus, posterior margin of the external condyle (Cracraft, 1974, 
pp. 500–501, 508, Fig. 6A; Bledsoe, 1988, character 62): (0) rounded,
smoothly grades into shaft, (1) moderate lateral extension, sharply undercuts 
base of shaft. Only Struthio and Rhea display the derived condition. 

42. Femur, trochanteric crest: (0) extends proximally beyond the level of 
the trochanteric fossa (iliac facet), (1) crest essentially on the same level with the
iliac facet. Bledsoe (1988, character 44) recognized four character-states, but 
three of those four are each unique to Aepyornis, Struthio, and the dromorni- 
thids. Bledsoe described Struthio as having the trochanteric fossa extended 
greatly beyond the level of the trochanteric crest, whereas here this condition is 
considered to be homologous to that of Rhea, Casuarius, Dromaius, and Apteryx. 

43. Femur, margin of the iliac facet: (0) curved sharply to form a lip that 
faces medially, surface highly concave, (1) rounded edge, no lip present, surface 
flattened to slightly convex. A similar condition was described by Bledsoe 
(1988, character 43) for the caudal margin of the proximal antitrochanteric ar- 
ticular surface: (a) highly concave, (b) moderately concave, (c) straight or nearly 
so. Bledsoe coded character-state (b) for Struthio and Rhea, and character-state 
(c) for Apteryx. In this study no distinction could be made between moderately 
concave and straight. The derived condition occurs in all of the ratites. 

44. Femur, external and fibular condyles (Cracraft, 1974, pp. 499, 500–501,
507–508, Fig. 3): (0) essentially equal in size and in their distal extension
relative to the internal condyle, (1) greatly enlarged and project distally be- 
yond level of internal condyle. Bledsoe (1988, character 46) recognized three 
character-states: (a) subequal distally, (b) lateral condyle extended slightly dis- 
tally beyond medial condyle, and (c) lateral condyle extended distally far be- 
yond medial condyle. Character-state (b) united dinornithids with Casuarius 
and Dromaius, and character-state (c) united Struthio and Rhea. The specimens 
examined exhibited so much variation that it is difficult to distinguish between 
Bledsoe’s character-states (b) and (c). Therefore, Struthio and Rhea are coded for 
the same derived condition as that in Dromaius, Casuarius, and the dinornithids. 

45. Femur, fibular condyle, posterior view: (0) relatively sharp posterolateral
margin, proximal margin essentially level with external condyle,
(1) rounded posterolateral margin, proximal margin essentially level with external
condyle, (2) rounded posterolateral margin, proximal margin not extended
as far proximally as external condyle. Cracraft (1974, pp. 498–499, 507, Fig. 3)
recognized only one derived condition, which Cracraft described as occurring 
in Dromaius and Casuarius. Bledsoe (1988, character 48) partitioned the varia- 
tion into two character-states: (a) lateral and fibular condyles extend distally 
about the same distance, (b) fibular condyle extends less distally (three-quarters 
or less) than the lateral condyle. Bledsoe coded Apteryx, Casuarius, Dromaius, 
and Rhea as displaying character-state (b). The dinornithids are considered here 
to have the same derived condition as Dromaius and Casuarius: a rounded (not
sharp) posterolateral margin and a proximal margin level with the external con- 
dyle. Struthio and Rhea display a second derived condition in which the proxi- 
mal margin of the external condyle extends proximally beyond the fibular con- 
dyle. Apteryx resembles the ancestral condition. The character order is 1-0-2. 

46. Femur, internal condyle, medial view (Cracraft, 1974, pp. 498–499,
508, Fig. 3): (0) rounded along distal margin, (1) flattened along distal margin. 
Bledsoe (1988, character 50) recognized three character-states: (a) semicircular,
(b) triangular or elliptical, and (c) flattened. Bledsoe considered the primitive 
condition (a) to occur in the dinornithids and Apteryx, the derived state (b) to 
occur in Dromaius and Casuarius, and state (c) to occur in Struthio and Rhea. 
Specimens examined in this study displayed no distinguishable difference be- 
tween semicircular and triangular or elliptical, so only Rhea and Struthio were 
considered to display the derived condition. 

47. Femur, external condyle viewed laterally (Cracraft, 1974, pp. 498, 507): 
(0) appears more rounded and projects more posteriorly, (1) appears more ellip- 
tical and projects more proximally. The derived condition is found in Casuarius 
and Dromaius. 

48. Femur, rotular groove (Cracraft, 1974, pp. 498, 508): (0) broad and 
shallow, (1) narrow and deep. The derived character-state is found to occur in 
Struthio and Rhea. 

49. Femur, internal condyle: (0) posterior facet ovoid in shape, (1) posterior 
facet triangular in shape. The derived condition occurs in Rhea, Struthio, Apte- 
ryx, and the dinornithids. 

50. Femur, distal end of external condyle: (0) pit for the tibialis anticus 
deeply excavated and narrow mediolaterally, (1) pit shallow and wide, cutting 
into the fibular condyle laterally. The pit for the attachment for the tibialis anti- 
cus becomes shallow and widens so as to slightly excavate laterally into the fi- 
bular condyle. The derived condition is found in Dromaius and Casuarius. 

51. Femur, popliteal fossa (Bledsoe, 1988, character 49): (0) shallow, almost 
flat, (1) very deep, extending anteriorly. Struthio and Rhea display the derived 
condition. 

52. Lacrimal, elongate supraorbital process, projecting posterolaterally over 
the orbit: (0) absent, (1) present. The process is absent in the Tinamidae,
Anhimidae, Megapodiidae, dinornithids, and Apteryx and is present in Struthio, Rhea,
Casuarius, and Dromaius. 

53. Squamosal, projection of zygomatic process: (0) projects slightly antero- 
laterally just over the articulation with the quadrate, (1) projects anterolaterally 
over at least two-thirds of the body of the quadrate. The Anhimidae, Megapo- 
diidae, and Tinamidae have a zygomatic process that just slightly projects an- 
terolaterally over the articulation with the quadrate, and this condition is here 
postulated to be primitive. All the ratites have the derived condition. 

54. Pterygoid: (0) not divided, simple rodlike structure, (1) divided into 
dorsal and ventral surfaces. The primitive condition occurs in the outgroups, 
the Tinamidae, Struthio, Rhea, Casuarius, and Dromaius. The body of the ptery- 
goid is found as a simple rodlike unit that articulates posteriorly with the quad- 
rate and the basipterygoid processes and anteriorly with the vomer and the pal- 
atines. Variation in the shape of the articulating surfaces of these elements is 
autapomorphic for each taxon. In Apteryx and the dinornithids the pterygoid is 
divided anteriorly, forming dorsal and ventral surfaces. This saddle-shaped
portion on the medial side of the bone is henceforth called the pterygoid fossa.
The dorsal fork appears to be the main body of the pterygoid and homologous 
to the body of the pterygoid in the other paleognaths. This fork extends anteri- 
orly and ankyloses to the dorsal surface of the vomer. The ventral fork is the 
unique structure and ankyloses with the palatine anterolaterally and extends
anteriorly to ankylose to the ventral surface of the vomer. Parker (1891, pp. 55–56)
described the pterygoid of Apteryx as divided anteriorly into medial and
lateral processes, with the medial process articulating with the lateral border
of the vomer and the lateral process articulating with the dorsolateral border
of the palatine. Jollie (1957, p. 420) and McDowell (1948, pp. 527–528) also
described the pterygoid in essentially the same manner. This divided pterygoid 
and the resultant pterygoid fossa are apparently not found in any other birds 
except perhaps the "Lithornithidae" (Houde, 1988, pp. 20, 47, Fig. 5). 

55. Quadrate, intercondylar fossa: (0) shallow, (1) deep, rounded pit. The 
fossa is shallow in the outgroups, Struthio, Rhea, Casuarius, and Dromaius. In Ap- 
teryx and the dinornithids the fossa is deeply marked, partially as a consequence 
of the posterior articulating surface having a strong ventral elevation. 

56. Basitemporal plate and mammillar tuberosities: (0) posterolateral mar- 
gins of the basitemporal plate flattened or just slightly developed into mammil- 
lar tuberosities, (1) posterolateral margins of the plate with well-developed 
mammillar tuberosities. The tuberosities have been shown to occur at the junc- 
tion of the basioccipital, exoccipital, and prootic bones (Parker, 1895, p. 384) 
in dinornithids and Apteryx. The tuberosities are poorly developed or absent in 
the outgroups and the other ratites. 

57. Maxillopalatine antrum: (0) absent, (1) present as a "large pocket" 
formed from the maxillopalatines with the anterior portion ankylosed to the 
dorsal surface of the posterior maxillary, (2) present as a "large pocket" formed 
from the maxillopalatines, the anterior margin ankylosing with the posterior 
margin of the maxillary, (3) greatly reduced to a "small pocket" that ankyloses 
anteriorly with the posterior maxillary. The antrum is present in all ratites (ex- 
cept Rhea, see below), but in different character-states. The antrum in all taxa 
(Parker, 1895, plate LXII, Figs. 59–63) is located on the dorsolateral surface of
the maxillopalatine. The antrum consists of a "pocket," the floor of which is 
formed by the maxillopalatine, and is covered by a thin lamina of bone with a 
posterior opening. Pycraft (1900, pp. 185–187) described the antrum as the
interior half of the maxillopalatine having the outer and inner borders turned 
upward to meet in the middorsal line to form a long, thin-walled tunnel. Mc- 
Dowell (1948, pp. 527-528) described the antrum as a dorsal arched lamina of 
the maxillopalatine joining the ventral lamina at its margins, forming hollow 
cones with posterior openings. In the dinornithids and Apteryx, the dorsal and 
ventral surfaces are entirely formed from maxillopalatine bone, but the anterior 
portion ankyloses to the dorsal surface of the posterior maxillary (character- 
state 1). In the dinornithids the posterior openings are wide, whereas in Apteryx
the posterior openings are small foramina. In some Apteryx specimens the
posterior foramina could not be found. The antrum of Dromaius and Casuarius is
also formed entirely from maxillopalatine bone, but ankyloses anteriorly with 
the maxillary (character-state 2). Struthio has the antrum reduced to a small, es- 
sentially open, anterior portion ankylosing with the posterior maxillary 
(character-state 3). Rhea displays a structure similar to that in Struthio, but it is so 
reduced that the homologous condition is uncertain. For the purposes of this 
analysis, Rhea has been coded as questionable. The presence of an antrum is 
derived; in this analysis the polarity of the character transformation beyond the 
primitive state is unordered. 

58. Olfactory chamber/tubercle: (0) ossification of chamber relatively 
poorly developed, (1) ossification of chamber well developed. The third, pos- 
teriormost vestibule of the nasal cavity, when present, is the olfactory chamber 
(Portmann, 1961, pp. 42-43). The Tinamidae, Anhimidae, and Megapodiidae, 
and all ratites except Apteryx and the dinornithids have a chamber that is poorly 
ossified. In Apteryx and the dinornithids, the olfactory chamber is enlarged and 
fused to the ectethmoid complex (Parker, 1891, pp. 48–50; 1895, p. 389). In
Apteryx the chamber contains a complex of ossified turbinals, whereas in dinor- 
nithids the chamber is empty. 


Comments on Morphological Characters of Bledsoe (1988)


Bledsoe’s (1988) characters 10, 16, 17, 20, 23, 24, 25, 36, 39, 58, 59, 65, 75, 76, 77, 
79, 81, 82, and 83 were excluded from this analysis because they were autapomor- 
phies and thus phylogenetically uninformative. There are 23 additional characters 
in Bledsoe’s analysis that have also been excluded from this study. The reasons for 
this warrant further discussion (the character numbers and descriptions correspond 
to those in Bledsoe, 1988). 
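
The exclusion of autapomorphies can be illustrated with a short check for parsimony-informative characters. This is a minimal sketch, assuming the standard criterion that a character is informative only when at least two states are each scored in at least two taxa, and that "?" marks an unknown score. The function name is hypothetical. The first example follows the description of character 16 in Appendix I for the ingroup taxa only; the second is a hypothetical pattern in which a single taxon differs.

```python
from collections import Counter

MISSING = "?"  # assumed symbol for an unknown state

def is_parsimony_informative(states):
    """True if at least two states are each scored in two or more taxa."""
    counts = Counter(s for s in states.values() if s != MISSING)
    shared_states = [s for s, n in counts.items() if n >= 2]
    return len(shared_states) >= 2

# Character 16 as described in the text (ingroup taxa only).
metacarpals = {
    "Struthio": "0", "Rhea": "0",
    "Dromaius": "1", "Casuarius": "1", "Apteryx": "1",
    "Dinornithidae": MISSING,
}

# Hypothetical autapomorphy: only one taxon shows the derived state.
single_taxon = {
    "Struthio": "0", "Rhea": "0", "Dromaius": "0",
    "Casuarius": "1", "Apteryx": "0",
}
```

A state found in a single taxon cannot group that taxon with any other, which is why such characters do not help to choose among trees.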


2. Craniolateral process (sternum): (a) elongate, (b) shortened, (c) very 
short. Character-state (b) was described only for the Dromornithidae. Bledsoe 
(1988) described a short craniolateral process (scapulocoracoid) to occur in 
Casuarius, Dromaius, and Struthio. In the specimens examined for this study only 
Casuarius was found to have shortened processes. The processes for Dromaius 
and Struthio did not differ from the ancestral condition, so this character is an 
autapomorphy and uninformative within ratites. 

4. Number of incisures (sternum): (a) two, (b) none, (c) four. Bledsoe 
(1988) refers to the posterior margin of the sternum and the presence or ab- 
sence of posterior medial processes. Bledsoe described four incisures unique to 
Struthio and no incisures as derived for Casuarius, Dromaius, and Rhea. The ho- 
mologous condition could not be determined among taxa due to variation 
among specimens and the loss of the posterior lateral processes in Dromaius, 
Casuarius, and Rhea (Appendix I, character 2). 

8. Medial groove of glenoid cavity (scapulocoracoid): (a) absent, (b) present. 
Bledsoe (1988) described the groove as present in Struthio and Dromaius, but in 
the specimens examined here, the groove was found to vary in occurrence 
across all ratite taxa and as such is considered uninformative. 

13. Pneumatic foramen (humerus): (a) present, (b) absent. The presence of 
this character is inconsistent across taxa. Bledsoe (1988) described the foramen 
as being absent in the dinornithids, Apteryx, Struthio, and Rhea, but it was also 
found to be absent in several specimens of Dromaius and Casuarius.

15. Position of the head (humerus): (a) near midline, (b) shifted dorsally,
(c) shifted ventrally. Character-state (c) is unique for Aepyornis. Bledsoe (1988)
described the humeral head as being shifted dorsally in Dromaius, Rhea, and 
Struthio. In the specimens examined, the humeral head was essentially near the 
midline for all taxa. 

21. Shape of shaft in cross-section (humerus): (a) elliptical proximally and 
distally, (b) circular proximally and distally, (c) triangular proximally and dis- 
tally, (d) triangular proximally, elliptical distally. Variation in the specimens ex- 
amined was sufficiently great to blur distinction among the character-states. 

22. Distal end (humerus): (a) widest cranially, (b) widest caudally, (c) cranial,
medial, and caudal widths subequal, (d) widest medially. The specimens
examined in this analysis did not differ distinctly from character-state (a),
the primitive condition.

26. Ulna, proximal end: (a) broad in proximal view, (b) narrow in proximal 
view. Bledsoe (1988) described Casuarius, Dromaius, and Apteryx as being nar- 
row in proximal view, but the specimens examined here all appeared to be 
broad in proximal view and not distinct from the other taxa. 

27. Ulna and radius: (a) unfused, (b) fused. Bledsoe (1988) described Cas- 
uarius as polymorphic and Dromaius as having character-state (b). All of the 
specimens examined for this analysis have unfused ulnae and radii. 

28. Width of shafts of ulna and radius: (a) ulna broader, (b) subequal. Bled- 
soe (1988) described Struthio and Casuarius as having the primitive condition in 
which the ulna is broader than the radius. The specimens examined in this 
study displayed an ulna that varied along the length of the bone, thus the dis- 
tinctness of the character-states is unclear. 

32. Fusion of os metacarpale majus and minus: (a) incomplete, (b) com- 
plete. Bledsoe (1988) described Apteryx, Casuarius, and Dromaius as having the 
fusion complete. This study found the fusion, reduction, or loss of these ele- 
ments to be unique for each taxon and thus highly variable across ratites. Both 
the majus and minus are present in Struthio and articulate at the distal end, but 
are often not entirely fused. In Casuarius, the majus and minus frequently are 
not extended completely distally in which case they are neither fused nor ar- 
ticulating; however, in some specimens they are bridged by a small piece of 
bone and they become “fused.” In Dromaius the minus cannot be found owing 
to a loss of the bone or fusion with the majus, and in Apteryx the minus is pres- 
ent, but it is not fused with the majus distally. The condition in Rhea resembles 
that in the Tinamidae and Megapodiidae in which the majus and minus are 
fused distally, although in some specimens fusion is essentially lacking. 

33. Os metacarpale majus: (a) wide dorsoventrally, (b) compressed dorso- 
ventrally. Too much variation occurred across the specimens examined to dis- 
tinguish character-states. 

41. Preacetabular tuberculum: (a) elongate, (b) short, (c) absent. Character-
state (c) is autapomorphic for the dromornithids. Bledsoe (1988) coded Casuarius, 
Dromaius, and Rhea as having a short tuberculum, but so much variation 
occurred among the specimens examined that no distinct character-state could 
be determined. 

43. Caudal margin of proximal antitrochanteric articular surface (femur): 
(a) highly concave, (b) moderately concave, (c) straight or nearly so. The speci- 
mens examined had variation that was indistinguishable from the ancestral 
condition. 

45. Relationship of longest axis of shaft of femur to longest axes of medial 
and lateral condyles: (a) parallel or nearly so, (b) divergent by 15 degrees or 
greater. The two character-states could not be distinguished. 

47. Dorsal margin of lateral condyle (femur): (a) straight or nearly so, 
(b) moderately concave dorsally, (c) highly concave dorsally. Variability was so 
great that no distinct character-states could be distinguished. 

55. Channeling at margins of intercondylar eminence (tibiotarsus): (a) pres- 
ent, (b) absent. Because none of the specimens examined seemed to display an 
intercondylar eminence (character 56, see below), the presence of channeling at 
the margins was considered to be ambiguous. 

56. Intercondylar eminence (tibiotarsus): (a) present, (b) absent. None of 
the specimens in this analysis displayed a structure that might be distinguished 
as an intercondylar eminence. 

60. Craniodistal margin of lateral condyle (tibiotarsus): (a) semicircular, 
(b) elliptical. In the specimens examined, variation was too great to discern 
character-states. 

64. Width and length of medial condyle (tibiotarsus): (a) moderate, 
(b) moderate in width, short in length, (c) very wide, very short. This study 
revealed too much variation to individuate discrete character-states. 

71. Depression between intercotylar area and hypotarsus (tarsometatarsus): 
(a) present, (b) absent. Bledsoe (1988) described the depression as being absent 
in Struthio and the Dinornithidae, whereas in this analysis, the depression was 
found to be present in those taxa so there is no difference from the ancestral 
condition. 

78. Medial and lateral margins of trochlea III (tarsometatarsus): (a) not par- 
allel, (b) parallel. No discrete states could be recognized in the specimens 
examined. 

80. Proximodistal lengths of proximal phalanges: (a) III longest, IV shortest, 
or III and IV subequal, (b) III longest, (c) II longest, (d) II and III subequal, IV 
shortest. No distinct character-states could be determined owing to high 
variability in the specimens examined. 

I. INTRODUCTION 


Well-corroborated phylogenies are crucial to understanding the evolution of the 
molecular, morphological, and behavioral differences among modern birds. How- 
ever, our understanding of phylogenetic relationships among and within avian or- 
ders is incomplete. Progress has been made, but our knowledge remains limited, 
owing in part to the apparent rapid radiation of modern birds. As a result, there are 
few shared derived characters, still conserved, delineating basal groups. Among the 
many challenges facing avian systematists is the need for additional discrete character 
data sets and knowledge of the constraints influencing character change over time. 
Our objective in this chapter is to present phylogenetic analyses of new molecular 
sequence data for select avian lineages, to place these analyses in the context of 
existing phylogenetic hypotheses, and to discuss pertinent issues regarding methods 
of phylogenetic inference. 

Most researchers recognize two primary groups of extant birds: (1) Paleogna- 
thae, which usually includes ratites (Struthio, Rhea, Casuarius, Dromaius, Apteryx) 
and nine genera of tinamous, and (2) Neognathae, which includes all other birds 
(Cracraft, 1981; Cracraft and Mindell, 1989; Olson, 1985; Sibley and Ahlquist, 
1990). Galliformes (pheasants, megapodes, curassows) and Anseriformes (water- 
fowl, screamers) are supported as sister taxa by diverse data sets, although their place- 
ment relative to Paleognathae and Neognathae remains unclear. Beyond this basal 
split between paleognaths and neognaths, however, there is little agreement among 
data sets or researchers on relationships among avian orders. 

Ordinal phylogenies based on morphological characters have many polytomies 
deep within them (e.g., Cracraft, 1988) because few derived morphological char- 
acters have been recognized that unite particular avian orders in sister-group rela- 
tionships. The only comprehensive ordinal phylogeny based on molecular data is 
from DNA-DNA hybridization analyses (Sibley and Ahlquist, 1990), and although 
significant insights have been gained, methodological and empirical problems 
remain. Many of the branches uniting orders are short (reflecting small ΔT50H values), 
and in many instances differences in ΔT50H values between replicate experiments 
(with identical taxa) exceed differences between experiments involving different 
taxa. This suggests that the DNA-DNA hybridization data are not well suited 
for divergences as large as those among many avian orders. Other difficulties with 
the Sibley and Ahlquist analyses pertain to violated assumptions of equivalence in 
genome size, gene arrangement, and AT:GC ratios; undesired hybridization and 
comparison of paralogous sequences, owing to mixing of single-copy and low- 
copy number genes; significant technical difficulties in measuring "normalized per- 
cent hybridization"; unenumerated corrections for evolutionary rate heterogeneity 
across taxa; and extrapolation of ΔT50H values to an unseen portion of a curve 
whose shape must be estimated. The lack of complete data matrices (ΔT50H values 
for pairwise comparisons of all taxa), understandable given the labor involved, is also 
problematic. Many species are compared to a single reference taxon (a radiolabeled 
tracer) and these many species comparisons are then "chained" together according 
to their increasing distance from the reference taxon. Thus, not all species are 
directly compared to each other, as is usual in phylogenetic analyses, and 
heterogeneity in rates of change can have additional significant effects on the inferred 
clustering. Also, analyses of ΔT50H values yield unrooted trees that may, in the 
absence of a specified outgroup, have multiple phylogenetic interpretations consistent 
with them (Cracraft, 1987; Houde, 1987; Gill and Sheldon, 1991; Mindell, 1992; 
Lanyon, 1992). 

Existing phylogenetic hypotheses for our current study taxa (Table I), based 
on both molecular and morphological data, do indicate a close relationship be- 
tween Procellariiformes (albatrosses, petrels, shearwaters) and Pelecaniformes (gan- 
nets, cormorants, pelicans); however, the relative position of Ciconiiformes (herons, 
storks, ibises) varies (e.g., Cracraft, 1988; Sibley and Ahlquist, 1990). Phylogenetic 
placement of "Turniciformes" (or Turnicidae; buttonquails or hemipodes) is poorly 
known. Falconiformes (falcons, hawks, eagles) and Strigiformes (owls) are variously 
placed as sisters or as distant relatives, with the latter requiring independent 
evolution of a general raptorial morphology and natural history in the two groups. 
Placement of Passeriformes (songbirds), Cuculiformes (cuckoos), Caprimulgiformes 
(goatsuckers, nightjars), Apodiformes (swifts), Trogoniformes (trogons), Coracii- 
formes (rollers, kingfishers), and Charadriiformes (shorebirds, gulls, auks) relative to 
other orders is similarly uncertain. Discussion of the 20-30 commonly recognized 
avian orders is complicated by the fact that monophyly for the orders as generally 
configured (e.g., Peters, 1931–1951; Wetmore, 1960; Mayr and Cottrell, 1979) 
cannot be presumed. Monophyly for Ciconiiformes, Gruiformes, Falconiformes, 
Pelecaniformes and Cuculiformes has been particularly contentious (Cracraft, 1981, 
1982; Olson, 1985; Sibley and Ahlquist, 1990; Sheldon and Bledsoe, 1993; Hedges 
and Sibley, 1994; Sibley, 1994; Hedges et al., 1995; Hackett et al., 1995). Discussion 
of ordinal relationships here is intended to focus on noncontroversial members of 
those orders, unless otherwise indicated. 


II. ISSUES IN PHYLOGENETIC ANALYSES 
A. Constraints on Molecular Evolution 


If we presume the existence of (1) a natural, divergent hierarchy of species based on 
common descent and (2) identifiable shared-derived characters (homologies) for 
taxa within the hierarchy, then discovery of monophyletic groups appears to be a 
straightforward task. Preference for a parsimony criterion in this discovery process 
derives from the general scientific practice of minimizing ad hoc assumptions (of 
homoplasy in the case of phylogenetic analyses) and the notion that parsimony cor- 
rectly determines which phylogenetic hypothesis is best supported by the character 
evidence (Farris, 1983; Nelson, 1994). Ideally, one need not invoke any specifics of 
evolutionary process to estimate genealogy. 

TABLE I  Study Species and Orders Representedᵃ

Order                    Species
Struthioniformes         Rhea americana**
                         Struthio camelus
Tinamiformes             Crypturellus undulatus
Procellariiformes        Diomedea nigripes
Pelecaniformes           Phalacrocorax pelagicus
Ciconiiformes            Mycteria americana
                         Nyctanassa violacea
                         (Phoenicopterus ruber)
Falconiformes            Accipiter superciliosus*
                         Circus aeruginosus*
                         Circaetus gallicus*
                         Gyps fulvus*
                         Buteo buteo*
                         Buteo jamaicensis*
                         Milvus migrans*
                         Haliaeetus leucocephalus*
                         Gampsonyx swainsonii
                         Pernis apivorus*
                         Pandion haliaetus*
                         Sagittarius serpentarius*
                         Falco peregrinus**
Anseriformes             Anhima cornuta
                         Chauna chavaria
                         Anseranas semipalmata
                         Cygnus buccinator
                         Cygnus atratus
                         Anas formosa
                         Anas platyrhynchos*
                         Aix sponsa
                         Aythya americana**
                         Anser rossii
                         Branta sandvicensis
                         Dendrocygna arcuata
                         Dendrocygna bicolor
                         Somateria fischeri
                         Thalassornis leuconotus
Galliformes              Gallus gallus**
                         Coturnix coturnix
                         Bonasa umbellus
                         Meleagris gallopavo
                         Phasianus colchicus
Gruiformes               Fulica atra
                         (Turnix varia)
Charadriiformes          Scolopax minor
                         Pterocles coronatus
Cuculiformes             (Tauraco hartlaubi)*
                         Coccyzus erythropthalmus
                         Opisthocomus hoazin
Strigiformes             Tyto alba
                         Nyctea scandiaca*
                         Otus longicornis
                         Otus mirus
                         Otus mindorensis
                         Otus megalotis everetti*
                         Otus megalotis nigrorum
                         Mimizuku gurneyi*
                         Bubo virginianus
                         Asio flammeus
                         Aegolius acadicus*
                         Ninox philippensis*
Caprimulgiformes         Chordeiles minor**
Apodiformes              Chaetura cinereiventris*
Trogoniformes            Trogon melanurus
Coraciiformes            Coracias caudata
Passeriformes            Vidua chalybeata**
                         Motacilla cinerea
                         Junco hyemalis
                         Sturnella magna
                         Cardinalis phoeniceus
                         Lanius collurio
                         Sayornis phoebe
(Nonavian reptiles,      Alligator mississippiensis**
outgroups)               Crocodylus porosus

ᵃ Ordinal assignments for species in parentheses are poorly known. The entire mt 12S gene was 
analyzed for all species listed. Single asterisks denote species having an additional 518 bases of mt COI 
available for analyses. Double asterisks denote species having 12 mt protein-coding genes (all but ND6) 
and both mt rDNAs available for analyses. Mitochondrial 12S and COI sequences have been deposited 
in GenBank with accession numbers U83709–U83787 and U86138–U86142. Sequences for Gallus 
gallus and Coturnix coturnix are from Desjardins and Morais (1990 and 1991, respectively). 

However, putative homologies can be difficult to identify. This may be attributed 
in part to the existence of a finite number of character states (four in the case 
of DNA) and rates of change sufficient to yield independent expressions of the same 
state. Putative homologies, like putative relationships among taxa, are products of 
phylogenetic analyses; attempts to improve them must, therefore, involve refine- 
ments in phylogenetic analyses. Such refinement has long been sought in the use of 
conservative characters. By giving greater weight in phylogenetic analyses to char- 
acters changing less frequently, confounding effects of homoplastic similarity can be 
reduced. Thus, improved understanding of rates of molecular evolution can poten- 
tially improve phylogenetic analyses. 
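
The idea of giving less weight to characters that change more often can be illustrated with a minimal sketch. It assumes that the observed and minimum possible numbers of steps for each character on some working tree are already known, and it borrows the concave form k / (k + h) associated with homoplasy-based weighting (Goloboff, 1993), where h is the number of extra steps. The function names, the constant k = 3, and the step counts are hypothetical, and applying the weights directly to step counts is a simplification for illustration, not the procedure used in this chapter.

```python
def homoplasy_weight(extra_steps, k=3.0):
    """Concave weight k / (k + h); h is the number of extra (homoplastic) steps."""
    if extra_steps < 0:
        raise ValueError("extra_steps must be non-negative")
    return k / (k + extra_steps)


def weighted_length(observed_steps, minimum_steps, k=3.0):
    """Return (weights, weighted length) for parallel lists of per-character steps.

    observed_steps: steps each character requires on the working tree.
    minimum_steps:  fewest steps each character could require on any tree.
    """
    weights = [
        homoplasy_weight(obs - low, k)
        for obs, low in zip(observed_steps, minimum_steps)
    ]
    length = sum(w * obs for w, obs in zip(weights, observed_steps))
    return weights, length


if __name__ == "__main__":
    # Character 1 shows no homoplasy; character 2 requires three extra steps.
    weights, length = weighted_length([1, 4], [1, 1])
    print(weights, length)
```

A character without homoplasy keeps full weight, whereas each additional extra step reduces the weight by a progressively smaller amount, so highly homoplastic characters are discounted but never ignored.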

Primary influences on rates of molecular evolution may be viewed as constraints 
on mutation and fixation (Mindell and Thacker, 1996). Variation in mutation rate 
can stem from differences in replication frequency, replication repair (different 
mechanisms and different enzymes), replication fidelity, exposure to mutagens 
(especially DNA-damaging oxygen free radicals), and the initial conditions of dif- 
ferential codon and nucleotide base composition for genes and taxa (see Britten, 
1986; Shigenaga et al., 1989; Ohta, 1993). These factors and others may influence 
variation in both the overall rate and the rate for particular kinds of change. The 
rate at which one or more mutations become fixed is influenced by a set of con- 
tinuous and overlapping constraints, including the genetic code, secondary or ter- 
tiary structure, gene function, population size, frequency of cladogenesis, and natu- 
ral selection. Variation in the nature of these constraints across taxa and over time 
contributes to patterns of variation in rates of molecular sequence evolution for 
different taxa and different kinds of character changes. Such patterns, however, must 
be considered as hypotheses to be judged on their individual merits for any set of 
taxa or genes in which they are proposed. Patterns should not be assumed a priori. 

Given information on relative rates of change and variability for different char- 
acters, unequal weighting may be used to reduce levels of homoplasy (convergent 
similarity) in the data (Swofford et al., 1996; Mindell and Thacker, 1996). Empha- 
sizing change in relatively conserved, or slowly changing, characters is one of the 
oldest principles in systematics (Darwin, 1859; Hennig, 1966; Farris, 1966), al- 
though the means for doing this remain controversial. Character weights based on 
phylogenetically determined measures of homoplasy (Farris, 1969; Goloboff, 1993) 
make no claims independent of phylogeny about the information content of char- 
acters. Other weighting schemes, whether based on measures of character compati- 
bility (Penny and Hendy, 1985; Sharkey, 1989) or, as is more common, various 
estimates of comparative absolute rate (e.g., Mindell et al., 1991; Honeycutt and 
Adkins, 1993) or frequency of cooccurrence of alternative states at homologous sites 
(Knight and Mindell, 1993; Wheeler, 1990a), do make claims that are not derived 
from character distributions on a phylogenetic hypothesis. However, the effects 
of fundamental physical chemical constraints, such as the genetic code, Watson- 
Crick base pairing, and the initial base composition, do not rely on any particular 
evolutionary theory. Rather, they circumscribe physical limitations on sequence 
character change across organisms, such that not all character changes are equally 
probable. Several studies indicate the success of unequal character weighting in 
recovering "known" phylogenies, whether those phylogenies are well corroborated 
and "noncontroversial," simulated, or result from manipulation of populations in a 
laboratory (Allard and Miyamoto, 1992; Atchley and Fitch, 1991; Hillis et al., 1994; 
Huelsenbeck, 1995; Miyamoto et al., 1994; Smith, 1994). Mindell and Thacker 
(1996) demonstrated that for a set of diverse vertebrate taxa, relative rates of evolu- 
tion estimated for various characters were similar whether based on discrete char- 
acter branch lengths from a phylogenetic tree or pairwise comparisons. 
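
Pairwise comparisons of the kind mentioned above can be sketched as a simple tally of substitution types between two aligned sequences. This is an illustrative sketch only: the function names and the example sequences are hypothetical, positions containing gaps or ambiguity codes are skipped by assumption, and no correction for multiple substitutions or base composition is attempted.

```python
from collections import Counter

# Unordered nucleotide pairs that differ by a transition (purine<->purine
# or pyrimidine<->pyrimidine); all other mismatches are transversions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def tally_pair(seq_a, seq_b):
    """Count mismatches by unordered nucleotide pair; return (counts, sites compared)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x != y:
            counts["".join(sorted((x, y)))] += 1
    return counts, compared


def summarize(seq_a, seq_b):
    """Summarize transitions, transversions, and uncorrected percent divergence."""
    counts, compared = tally_pair(seq_a, seq_b)
    ts = sum(n for pair, n in counts.items() if frozenset(pair) in TRANSITIONS)
    tv = sum(counts.values()) - ts
    divergence = 100.0 * (ts + tv) / compared if compared else 0.0
    return {
        "transitions": ts,
        "transversions": tv,
        "percent_divergence": divergence,
        "by_pair": dict(counts),
    }


if __name__ == "__main__":
    print(summarize("ACGTACGT", "GCGTATGA"))
```

Repeating such tallies over all pairs of taxa gives one set of relative rate estimates that can be compared with estimates taken from branch lengths on a tree.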


B. Sampling of Characters and Taxa 


Considerable debate has focused on two issues related to sampling in phyloge- 
netic analyses: (1) the relative merits of partitioning and combining data sets, and 
(2) taxonomic versus character congruence (Miyamoto, 1985; Cracraft and Min- 
dell, 1989; Kluge, 1989; Bull et al., 1993; Eernisse and Kluge, 1993; Chippindale 
and Wiens, 1994; deQueiroz et al., 1995). What can be called a "total data" 
approach accords equal weight to all characters regardless of their degree of 
homoplasy. An alternative "total evidence" approach (in our usage of that term) considers 
all characters as potentially informative, but can use successive approximations (Far- 
ris, 1969) and character analyses yielding rate estimates as sources of evidence in 
determining the relative information content of character sets. Weights may be ap- 
plied in alternative phylogenetic analyses accordingly. In this view, total evidence 
includes the characters themselves and our improving understanding of their history 
of change. This is in keeping with the view that theory and practice of systematics 
can be mutually informing and that informed unequal weighting can improve phy- 
logenetic hypotheses. The total data approach with equal weighting of all characters 
may be appropriate in the absence of evolutionary rate heterogeneity, or if homo- 
plasy can be demonstrated to be randomly distributed for a particular set of taxa 
and characters. However, the constraints on rates of molecular evolution discussed 
above, varying across taxa and character sets, tend to impose decidedly nonrandom 
effects on patterns of homoplasy accumulation. 

Numerous studies suggest that larger data sets perform better in recovery of a 
well-corroborated tree or a known tree based on simulation or laboratory manipu- 
lation, and a threshold for number of DNA characters may exist at which phylo- 
genetic analyses recover well-corroborated, “noncontroversial” sister relationships 
with equal weighting of most characters. Parsimony trees from random samples 
of 7000 equally weighted mitochondrial DNA characters (excluding the rapidly 
evolving D-loop region) yielded the same tree for 10 taxa as did the whole 
mitochondrial genome about 90% of the time (Cummings et al., 1995). In a numerical 
simulation study of four taxa with equal rates of evolution, Hillis et al. (1994) found 
that equally weighted parsimony analyses yielded the known tree 100% of the time 
with about 1500 bases. Parsimony analyses with unequal weighting required even 
fewer characters in recovering the known tree 100% of the time. These studies also 
suggest that larger numbers of taxa and (nonidealized) unequal rates of sequence 
change, as well as the presence of short internodes, will tend to require larger 
amounts of sequence data. Larger sets of DNA characters are also preferable to 
smaller sets because they provide a more comprehensive test of congruence among 
characters and often provide phylogenies that are less sensitive to alternative char- 
acter weighting schemes, alternative sequence alignments (particularly of ribosomal 
DNA), and alternative inclusion/exclusion of study taxa, including outgroups (e.g., 
Kluge, 1989; Wheeler, 1995; Mindell and Thacker, 1996). This is reflected in phy- 
logenies having high levels of statistical support based on mammalian whole mito- 
chondrial genomes and lower levels of support based on a single or small number of 
genes (see Cao et al., 1994a,b; Graur and Higgins, 1994; Janke et al., 1994). 

Similarly, more taxa are generally preferable to fewer taxa. Additional taxa can 
reduce the incidence of long branches within a tree, and so reduce the potential for 
attraction among them owing to convergent similarity. As more closely related taxa 
are included, more of the multiple substitutions at individual nucleotide positions 
may be recovered. However, the addition of single representatives of distantly re- 
lated taxa, with long branches, may have the opposite effect of increasing the level 
of convergent similarity within the data set. These issues are particularly relevant to 
selection of outgroup taxa. Analyses by Smith (1994) demonstrate that parsimony 
recovers a well-corroborated tree more often when multiple outgroup taxa repre- 
senting a single sister group, with relatively few long branches, are used, rather than 
multiple distantly related outgroup lineages. 


III. METHODS 
A. Study Characters and Taxa 


We provide analyses of three overlapping mt DNA sequence data sets. Our largest 
character data set consists of 13,298 nucleotide bases from 12 mitochondrial (mt) 
protein-coding genes (all but ND6) and both ribosomal (rDNA) genes from each 
of six taxa: greater rhea (Rhea americana), domestic chicken (Gallus gallus), redhead 
duck (Aythya americana), peregrine falcon (Falco peregrinus), village indigobird (Vidua 
chalybeata; Passeriformes), and Alligator mississippiensis. We present analyses of mt 12S 
rDNA, comprising 859 aligned nucleotide positions, for 72 species of birds repre- 
senting 18 different orders as traditionally configured (Table I). We also present 
analyses of 518 nucleotide positions of mt COI for a subset of the study taxa. Our 
12S rDNA analyses include multiple species within eight orders, as well as several 
taxa whose ordinal relationships have been particularly difficult to resolve: the 
hoatzin (Opisthocomus hoazin), a buttonquail (Turnix varia), and a flamingo 
(Phoenicopterus ruber). 

Mitochondrial 12S rDNA and COI sequences for most falconiform and strigi- 
form taxa (n=20) were determined using standard polymerase chain reaction (PCR) 
amplification techniques and manual sequencing with CircumVent (New England 
BioLabs, Beverly, MA), and direct incorporation of ³⁵S-labeled dATPs (see Knight 
and Mindell, 1993). Sequences for all other taxa in Table I, including the six with 
the largest set of sequence characters, were determined using an Applied Biosystems 
(ABI; Foster City, CA) 377 automated sequencer. To maximize efficiency and to 
generate greater overlap between contiguous sequences for the six taxa having 13 
kb sequenced, long PCR products were generated with an rTth DNA polymerase- 
based XL-PCR kit (Perkin-Elmer, Norwalk, CT) and, after gel purification, se- 
quenced directly with both the PCR primers and multiple internal primers. The 
PCR products were gel purified and then extracted from agarose with a QIAquick 
gel extraction kit (Qiagen, Chatsworth, CA). An aliquot of the purified PCR product 
was run on a minigel and compared to marker bands with known quantities of 
DNA to estimate DNA concentration. Approximately 30—150 ng of PCR product 
(depending on the size of the PCR fragment) was used in each sequencing reaction. 
Sequencing reactions used the ABI Prism dye terminator cycle sequencing ready 
reaction kit with AmpliTaq DNA polymerase FS (Perkin-Elmer). Unincorporated 
fluorescent labeled ddNTPs were removed using a Sephadex (G-50-Fine; Phar- 
macia, Piscataway, NJ) spin column. Sequencing reactions were run on a Long 
Ranger (FMC, Philadelphia, PA) acrylamide gel. Careful interpretation of the ma- 
chine output, sequencing and reconciling of both DNA strands, and repetition of 
reactions when necessary yield accurate sequence. Interpretation and checking of 
machine output was completed by use of the ABI Sequence Navigator program, 
which allows chromatograms to be aligned, compared, and edited on screen. 

Insertions of mtDNA sequences into the nuclear genome have been documented 
in a wide variety of taxa and now appear to be a common phenomenon (Zhang 
and Hewitt, 1996; Quinn, Chapter 1 in this volume). We are keenly aware of the 
potential problems nuclear homologs may cause in the accurate determination of 
mtDNA sequences as well as the potential errors introduced by their unwitting 
inclusion in phylogenetic analyses. Extracts of total genomic DNA from blood 
(birds have nucleated red blood cells) were used in previous published reports of 
nuclear insertions in birds (Quinn, 1992; Arctander, 1995; Sorenson and Fleischer, 
1996), and no blood samples were used in the current study. Additional precautions 
in our study include (1) the amplification of long fragments (although a continuous 
nuclear ex-mtDNA sequence of 7.9 kb has been documented in the cat, most are 
much smaller in length; Zhang and Hewitt, 1996), (2) the inclusion of degenerate 
positions in primers complementary to protein-coding regions (this reduces the 
possibility that primers will preferentially amplify a low copy number nuclear se- 
quence owing to primer mismatch with the mtDNA), (3) broadly overlapping 
PCR products (it is unlikely that two different sets of PCR pairs will both amplify a 
nuclear sequence from an mtDNA-rich sample), and (4) careful examination of 
sequences for double peaks (resulting from coamplification of both mtDNA and 
nuclear sequences), unexpected insertions or deletions (indels), and frameshifts or 
stop codons. 
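
The last of these checks, screening protein-coding sequences for unexpected stop codons, can be sketched as follows. The sketch assumes the vertebrate mitochondrial genetic code, in which TAA, TAG, AGA, and AGG act as terminators, and it assumes the input is an unaligned coding sequence read in the stated frame. The function name and the example sequences are hypothetical. The final codon is ignored because a terminal stop is expected there; many mitochondrial genes end in an incomplete stop codon, which this sketch does not attempt to model.

```python
# Terminators in the vertebrate mitochondrial genetic code.
VERT_MT_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def internal_stops(cds, frame=0):
    """Return (offset, codon) for each stop codon found before the last codon.

    cds:   coding sequence as a string; alignment gaps ('-') are removed.
    frame: 0-based offset of the first complete codon.
    """
    seq = cds.upper().replace("-", "")
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [
        (frame + 3 * index, codon)
        for index, codon in enumerate(codons[:-1])  # skip the terminal codon
        if codon in VERT_MT_STOPS
    ]


if __name__ == "__main__":
    print(internal_stops("ATGCCCTAA"))     # clean reading frame
    print(internal_stops("ATGAGACCCTAA"))  # AGA interrupts the frame
```

A non-empty result does not by itself identify a nuclear copy, but it marks a sequence that deserves the closer examination described above.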

Alignments for all mitochondrial genes were done using CLUSTAL W (Thompson 
et al., 1994) and adjustments by eye. For 12S rDNA the initial alignments were 
overlaid on a secondary structural model (see Fig. 8.1), and adjustments were sought 
that would (1) maintain alignment of both stems and nonstems across taxa, and 
(2) maintain base pairing within stems (see Kjer, 1995). Hypervariable regions that 
appeared to be randomly aligned and could not be improved by eye were omitted 
from the phylogenetic analyses below. For mt 12S rDNA, 242 base positions were 
omitted from the total alignment length of 1101 positions for the study taxa (Table 
I), yielding 859 base positions for analyses. For the mt 16S rDNA available for six 
taxa, 171 base positions were omitted from the total alignment length of 1671 
positions, based on comparisons to published secondary structural hypotheses (Gutell 
et al., 1993). We have not refined a 16S rDNA model for birds. 

FIGURE 8.1 Hypothetical mt 12S rRNA secondary structure for peregrine falcon (Falco peregrinus). 
The structure is based on previously published secondary structure models (Gutell, 1994; Van de Peer 
et al., 1994; Sullivan et al., 1995; Hickson et al., 1996) with identification of structures aided by 
compensatory base changes across helices among the birds in our data set. Helices are numbered according to 
Van de Peer et al. (1994). Unconventional base pairings are indicated by filled circles for (G-U) and 
unfilled circles for (A-C). Regions of substantial length variation among birds are indicated by solid 
curved lines adjacent to the Falco peregrinus sequence, and correspond roughly to regions omitted from 
parsimony analyses. 
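
The omission of hypervariable alignment positions described in the text above amounts to dropping a fixed set of columns from every aligned sequence. A minimal sketch follows; the function name, the toy alignment, and the excluded column indices are hypothetical, and only the position counts in the closing note come from the text.

```python
def mask_columns(alignment, excluded):
    """Return a copy of the alignment without the excluded (0-based) columns.

    alignment: dict mapping taxon name to aligned sequence (equal lengths).
    excluded:  set of column indices to omit from every sequence.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    (n_columns,) = lengths
    kept = [i for i in range(n_columns) if i not in excluded]
    return {
        name: "".join(seq[i] for i in kept)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    toy = {"taxon_1": "ACGTAC", "taxon_2": "AC-TAC"}
    print(mask_columns(toy, {2, 3}))
```

With the counts given in the text, removing 242 of 1101 aligned 12S positions leaves 859, and removing 171 of 1671 aligned 16S positions leaves 1500.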

The hypothetical mt 12S rRNA secondary structure for Falco peregrinus in 
Fig. 8.1 is based on previously published secondary structure models (Gutell, 1994; 
Van de Peer et al., 1994; Sullivan et al., 1995; Hickson et al., 1996) with identifica- 
tion of structures aided by compensatory base changes across helices among the 
birds in our data set. Helix 8 is particularly variable in the number of base pairs in 
its distal portion. Base pairing for helix 26 and the adjacent ends of helices 24 and 
25 were difficult to determine in many taxa, suggesting structural variability among 
species. Falco has four extra bases between helices 24 and 25, unlike any of the other 
birds in our data set. Additional avian examples for helices 8 and 24–26 are shown. 
The proposed 5-bp helix between helices 23 and 24 is also unusual in Falco. None- 
theless, most of the birds in our data set have at least two or three bases of comple- 
mentary sequence in positions homologous to the distal 2-bp in Falco. In most other 
taxa, this region has been represented as two long single-stranded sequences (e.g., 
Houde, Chapter 5 in this volume). We have drawn helix 38 as in Hickson et al. 
(1996); an alternative structure is given by Houde (Chapter 5 in this volume). 
Increasingly refined models of secondary structure as in Fig. 8.1 are useful as templates 
for informing alignments and character homology hypotheses for conserved struc- 
tural features. 

Although a complete discussion of patterns of sequence variation in relation to 
secondary structure is beyond the scope of this study, we suggest that a weighting 
scheme based on a simple categorization of 12S rDNA sequence characters into 
helices, bulges, loops, and other unpaired regions on the premise that these struc- 
tures evolve at different rates would make unrealistic assumptions. Whereas the 
most variable regions are generally terminal loops, other loops are highly conserved. 
Likewise, certain helices have high rates of base substitution, while others show little 
if any variation across taxa. In addition, some helices (e.g., helix 8) appear to vary 
in length and to shift slightly in position. 


B. Phylogenetic Analyses 


Phylogenetic analyses were conducted using the criteria of parsimony and congru- 
ence among characters. Given the large numbers of taxa and characters, heuristic 
searches for shortest trees were conducted with 1000 replicates, using starting to- 
pologies based on random addition of taxa to reduce the possibility of finding a 
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FIGURE 8.2 Number of substitutions for nucleotide pairs versus overall percent divergence for the 
mitochondrial 12S rDNA gene from 50 birds (same species as in Fig. 8.5), plus Crocodylus porosus, Homo 
sapiens, Mus musculus, and Didelphis virginiana. Lines were drawn using logarithmic curves for T—C and 
A-G substitutions and straight lines for the other substitution types, based on their fit to plotted points 
(not shown). Substitution tallies have been standardized to account for differences in base composition, 
by dividing the observed number of substitutions for nucleotide pairs by the mean ratio of observed to 
expected (25%) frequencies. 


local parsimony optimum rather than the universal optimum, using PAUP (Swof- 
ford, 1993). Exhaustive searches for the most parsimonious tree were conducted for 
smaller subsets of taxa to focus on particular taxa and test alternative existing hy- 
potheses regarding their phylogenetic relationships. Support indices are calculated 
for nodes within most parsimonious trees to denote their degree of character sup- 
port (Bremer, 1988) where all characters have been equally weighted. They indi- 
cate the number of additional steps required for the shortest tree lacking the par- 
ticular clade. 
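
The support index can be illustrated with a minimal Python sketch. It is not the PAUP procedure used in this study: the function names, the nested-tuple tree encoding, and the restriction to one outgroup plus three ingroup taxa (so that only three rooted topologies need to be compared) are assumptions made for illustration. Tree length is computed with the Fitch algorithm under equal weights, and the support index for the best-supported pair is the length of the shortest tree lacking that pair minus the length of the most parsimonious tree.

```python
from itertools import combinations


def fitch_length(tree, column):
    """Fitch parsimony length of one character on a rooted binary tree.

    `tree` is a nested tuple of taxon names; `column` maps taxon -> state.
    """
    steps = 0

    def state_set(node):
        nonlocal steps
        if isinstance(node, str):              # leaf: its observed state
            return {column[node]}
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:
            return common
        steps += 1                             # no shared state: one change
        return left | right

    state_set(tree)
    return steps


def tree_length(tree, alignment):
    """Sum of Fitch lengths over all aligned positions (equal weights)."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_length(tree, {taxon: seq[i] for taxon, seq in alignment.items()})
        for i in range(n_sites)
    )


def support_for_best_pair(alignment, outgroup):
    """Exhaustive search over the three topologies for three ingroup taxa.

    Returns (best sister pair, its tree length, support index).
    """
    ingroup = sorted(t for t in alignment if t != outgroup)
    if len(ingroup) != 3:
        raise ValueError("this sketch handles exactly three ingroup taxa")
    lengths = {}
    for pair in combinations(ingroup, 2):
        (third,) = set(ingroup) - set(pair)
        lengths[frozenset(pair)] = tree_length((outgroup, (pair, third)), alignment)
    best_pair = min(lengths, key=lengths.get)
    best = lengths[best_pair]
    shortest_without = min(v for k, v in lengths.items() if k != best_pair)
    return best_pair, best, shortest_without - best


# Hypothetical toy alignment (not data from this chapter).
toy = {"O": "AAAAC", "A": "GGGAC", "B": "GGGTC", "C": "AAATG"}
pair, length, support = support_for_best_pair(toy, "O")
print(sorted(pair), length, support)
```

With real data sets the number of topologies grows too quickly for this kind of enumeration, which is why heuristic searches are used for the larger taxon sets.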

To inform character weighting in mt 12S rDNA analyses, we compare inferred 
rates of substitution among the 6 possible nucleotide pairs in all pairwise comparisons
among 50 birds, a crocodilian, and 3 mammals. As expected, we find transitions
to be more frequent than transversions and a decreasing frequency for T-C, 
A-G, A-C, A-T, T-G, and C-G substitution types (Fig. 8.2). We also plot mean 
percent divergence among these same 12S sequences against estimated time since 





divergence for the taxa compared (Fig. 8.3). Divergence time estimates based on 
fossils include dates of 310 million years ago (MYA) for the split between mammals 
and reptiles/birds and 245 MYA for the split between birds and crocodilians (Benton,
1990). Divergence times among avian orders are poorly known, although they 
are likely similar for most orders given an apparent rapid radiation of forms. Current 
FIGURE 8.3 Mean percent divergence in mitochondrial 12S rDNA sequence versus estimated time 
since divergence (ETSD) for all species pairs for 50 birds (same species as in Fig. 8.5), Crocodylus porosus, 
Homo sapiens, Mus musculus, and Didelphis virginiana. Divergence time estimates based on fossils include 
dates of 310 million years ago for the split between mammals and reptiles/birds and 245 million years ago for the 
split between birds and crocodilians (Benton, 1990). Divergences for avian orders were set to 80 million 
years ago and divergences among Accipitridae and Strigidae species were set to 30 million years ago (see 
text). Substitution tallies have been standardized to account for differences in base composition as in 
Fig. 8.2. TIs and TVs, Transitions and transversions, respectively. 


estimates for extant avian ordinal ages range from more than 100 MYA based on 
molecular data (Sibley and Ahlquist, 1990; Hedges et al., 1996) to less than 60 MYA 
based on fossil evidence (see Feduccia, 1995), and we have used an estimate of 80 
MYA for divergences among avian orders in Fig. 8.3. We use an estimate of 30 MYA 
for divergences among species within the avian families Accipitridae and Strigidae, 
consistent with the oldest known fossils for these taxa (Brodkorb, 1964; Olson, 
1985; Carroll, 1988). 
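
The pairwise tallies behind Figs. 8.2 and 8.3 can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, substitution pairs are keyed alphabetically (so T-C appears as "C-T"), and the counts are raw observed differences without the standardization for base composition described in the Fig. 8.2 caption.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_counts(seq_a, seq_b):
    """Tally differences between two aligned sequences by unordered base pair.

    Returns (Counter of pair -> count, number of positions compared).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue                      # skip gaps and ambiguity codes
        compared += 1
        if x != y:
            counts["-".join(sorted((x, y)))] += 1
    return counts, compared


def is_transition(pair):
    """True for purine-purine or pyrimidine-pyrimidine changes."""
    x, y = pair.split("-")
    return {x, y} <= PURINES or {x, y} <= PYRIMIDINES


def summarize(seq_a, seq_b):
    """Transitions, transversions, and uncorrected percent divergence."""
    counts, compared = substitution_counts(seq_a, seq_b)
    ts = sum(n for pair, n in counts.items() if is_transition(pair))
    tv = sum(n for pair, n in counts.items() if not is_transition(pair))
    divergence = 100.0 * (ts + tv) / compared if compared else float("nan")
    return {"transitions": ts, "transversions": tv, "percent_divergence": divergence}
```

Applying `summarize` to every pair of aligned sequences and plotting the per-type counts against percent divergence, or percent divergence against an assumed divergence time, gives plots of the kind shown in the two figures.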

Figures 8.2 and 8.3 indicate that transitions are more likely to entail convergent 
similarities than are transversions for analyses of avian orders in our study set, and 
for our analyses of single representatives from diverse orders we use transversions 
only. For analyses mixing orders with both single and numerous representative taxa 
we use two sets of weights in alternative analyses, (1) transversions and transitions 
weighted equally and (2) a weighting ratio of 5:1, respectively. Mitochondrial 
DNA exhibits diverse rates of evolution, varying both among and within genes, 
gene regions, and across taxa. Thus, statements regarding rate must be viewed as 
simplified hypotheses specific to the data set on which they are based. 
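
The effect of the alternative weighting schemes on the cost of a single character can be sketched with the Sankoff (generalized parsimony) algorithm. This is a minimal Python sketch rather than the PAUP implementation used here; the function names, the nested-tuple tree encoding, and the example character are assumptions for illustration. Setting the transition weight to zero gives transversion parsimony.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def change_cost(x, y, tv_weight=5, ts_weight=1):
    """Cost of changing base x to base y under a transversion:transition weighting."""
    if x == y:
        return 0
    same_class = (x in PURINES) == (y in PURINES)
    return ts_weight if same_class else tv_weight


def sankoff_cost(tree, column, tv_weight=5, ts_weight=1):
    """Minimum weighted cost of one character on a rooted tree.

    `tree` is a nested tuple of taxon names; `column` maps taxon -> base.
    """
    def costs(node):
        if isinstance(node, str):          # leaf: only the observed base is free
            return {b: (0 if b == column[node] else float("inf")) for b in BASES}
        child_costs = [costs(child) for child in node]
        return {
            b: sum(
                min(cc[c] + change_cost(b, c, tv_weight, ts_weight) for c in BASES)
                for cc in child_costs
            )
            for b in BASES
        }

    return min(costs(tree).values())


# Hypothetical character: one transition within a pair, one transversion between pairs.
tree = (("t1", "t2"), ("t3", "t4"))
column = {"t1": "A", "t2": "G", "t3": "C", "t4": "C"}
print(sankoff_cost(tree, column, 1, 1))   # equal weights
print(sankoff_cost(tree, column, 5, 1))   # 5:1 weighting
print(sankoff_cost(tree, column, 1, 0))   # transversions only
```

Summing this cost over all aligned positions gives the weighted tree length that is minimized in the search.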

Differences among taxa in base composition can contribute to increased amounts 
of convergent similarity and confound phylogenetic analyses. However, this is ame- 
liorated by focusing on relatively conserved, nonsaturated types of character change. 
12S rDNA mean percent base composition for 71 avian species (Table I) is A: 31.1, 
TABLE II Parsimony-Based Relative Rate Tests among Eight Birds^a

                   1         2         3         4         5       6       7
1. Struthio
2. Crypturellus    8/23**
3. Anser           7/10      26/14*
4. Meleagris       12/11     29/13**   14/10
5. Turnix          6/14      20/13     6/11      10/19
6. Motacilla       9/26**    20/22     11/25*    10/28**   12/21
7. Sagittarius     5/19**    20/19     9/20*     9/24**    10/16   19/17
8. Otus            11/27**   25/26     11/24*    16/33*    14/22   26/25   20/22

^a Tests are based on mt 12S rDNA transversions only using Crocodylus acutus as an outgroup (Mindell 
and Honeycutt, 1990). Numbers denote unambiguous, autapomorphic character changes (column 
taxon/row taxon) based on branch lengths for a series of three-taxon trees using MacClade (Maddison 
and Maddison, 1992). *, P < 0.05; **, P < 0.01. See Table I for full species names (Otus megalotis everetti 
is used here). 


C: 26.9, G: 22.5, T: 19.5. A chi-square test of homogeneity of base frequencies 
across taxa found no significant differences. Similar analyses of relative rate were 
conducted for mt COI sequences from a subset of the avian study taxa, and the 
observed patterns, with higher rates of change at third-codon positions compared 
to first and second positions, are consistent with studies of other protein-coding mt 
genes in vertebrates (see Fig. 2 in Mindell and Thacker, 1996). Mitochondrial COI 
mean percent base composition for 27 avian species (Table I) is A: 25.9, C: 31.8, 
G: 16.5, T: 25.8. Again, a chi-square test of homogeneity of base frequencies across 
taxa found no significant differences. 
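
A chi-square test of homogeneity of base frequencies can be sketched as below. This is a minimal Python illustration, not the software used for the tests reported here; the function names and the toy counts are assumptions. The statistic compares observed base counts for each taxon with the counts expected if all taxa shared the pooled composition, with (taxa - 1) x (bases - 1) degrees of freedom.

```python
def base_counts(seq):
    """Observed counts of A, C, G, and T in one sequence."""
    seq = seq.upper()
    return {b: seq.count(b) for b in "ACGT"}


def chi_square_homogeneity(counts):
    """Chi-square statistic and degrees of freedom for a taxon-by-base table.

    `counts` maps taxon -> {base: observed count}.
    """
    bases = sorted({b for row in counts.values() for b in row})
    row_totals = {t: sum(row.get(b, 0) for b in bases) for t, row in counts.items()}
    col_totals = {b: sum(row.get(b, 0) for row in counts.values()) for b in bases}
    grand = sum(row_totals.values())
    stat = 0.0
    for taxon, row in counts.items():
        for b in bases:
            expected = row_totals[taxon] * col_totals[b] / grand
            if expected > 0:               # skip bases absent from every taxon
                stat += (row.get(b, 0) - expected) ** 2 / expected
    df = (len(counts) - 1) * (len(bases) - 1)
    return stat, df


# Hypothetical counts for two taxa and two bases.
stat, df = chi_square_homogeneity({"x": {"A": 10, "C": 20}, "y": {"A": 20, "C": 10}})
print(round(stat, 3), df)
```

The statistic would then be compared with the chi-square distribution for the stated degrees of freedom to judge significance.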

We compared relative rates of mt 12S rDNA sequence change among the study 
taxa, using a parsimony-based approach (Mindell and Honeycutt, 1990; Mindell 
et al., 1996), and present a subset of the significant comparisons in Table II. Taxa 
showing faster rates of change relative to others include a tinamou (Crypturellus), a 
passeriform (Motacilla), secretary bird (Sagittarius), and a strigiform (Otus). Other 
passeriform and strigiform study taxa also show relatively fast rates, although the 
comparisons vary among species. Relative rate comparisons help in identifying 
potential long branch attraction problems, particularly for lineages represented by 
single species. 
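
The logic of a count-based relative rate comparison can be sketched as follows. The exact significance test applied to the counts in Table II is not described in this section, so the test below is an assumption: a chi-square goodness-of-fit test with 1 degree of freedom against the expectation that two lineages have accumulated equal numbers of autapomorphic changes since their common ancestor. The function name is hypothetical, and the resulting P values need not match the significance levels marked in Table II.

```python
import math


def relative_rate_test(changes_a, changes_b):
    """Chi-square (1 df) test of equal numbers of unique changes on two branches.

    Returns (statistic, two-sided P value).
    """
    total = changes_a + changes_b
    if total == 0:
        raise ValueError("no changes on either branch")
    # With equal expectations of total/2, the statistic simplifies to (a - b)^2 / (a + b).
    stat = (changes_a - changes_b) ** 2 / total
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat, p = relative_rate_test(8, 23)
print(round(stat, 3), round(p, 4))
```

Counts that are close to equal, such as 20 and 22, give a large P value under this test and no evidence of a rate difference.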


IV. RESULTS AND DISCUSSION 


A. Phylogenetic Placement of Anseriformes 
and Galliformes 


Our largest character data set includes 13,298 mtDNA characters for 5 birds and a 
crocodilian outgroup. Single most parsimonious trees based on all characters equally 
FIGURE 8.4 Single most parsimonious trees based on exhaustive searches using 13,298 nucleotide 
base positions from 12 mitochondrial protein-coding genes (only ND6 is missing) and the 2 rDNA genes 
for each of 6 taxa, with (A) all characters given equal weight (11,151 steps) and (B) all characters weighted 
equally at codon positions 1 and 2 and with third codon position and rDNA transitions given zero weight 
(7584 steps). Alligator mississippiensis was designated as the outgroup. Numbers on branches denote 
branch lengths/support indices (Bremer, 1988). The topology shown in (B) was also found to be most 
parsimonious in analysis of the corresponding amino acid sequences for 12 mt protein-coding genes 
using the PROTPARS weight matrix, and found to be optimal in maximum-likelihood analysis of amino 
acid sequences using protML and the JTT model in the MOLPHY set of programs (Adachi and Hase- 
gawa, 1992). The basal position of a passeriform (Vidua) among primary lineages of extant birds is un- 
conventional (see also Fig. 8.5). (C) shows the ingroup topology from (B) as an unrooted network, 
illustrating sensitivity of topology to the root placement (indicated by an arrow). Placement of the root 
along the lineage leading to Rhea in (C) would yield the more conventional topology (Cracraft and 
Mindell, 1989; Sibley and Ahlquist, 1990) with the Aythya/Gallus clade being sister to the neognaths 
(Falco/Vidua). Addition of sequences from more taxa within the avian ingroup (including passeriforms), 
and from more diverse crocodilians, may help reduce potential attraction among long branches, and 
further test the position of Passeriformes in future analyses. 


weighted (Fig. 8.4A) and codon positions 1 and 2 characters equally weighted with 
rDNA transversions only (Fig. 8.4B) differ only in the relative position of Aythya, 
an anseriform, being sister to Rhea in the first analysis and sister to Gallus in the 
second. We believe the second analysis to be a better phylogenetic estimate, in light 
of greater saturation of third-codon positions and rDNA transitions. Although the 
character set is smaller in Fig. 8.4B, the support index for the Gallus/Aythya node 
(36) is substantially larger than for the Rhea/Aythya node (13) in Fig. 8.4A. Figure
8.4B unites representative galliform and anseriform species as sisters, most recently
sharing a common ancestor with Rhea, a paleognath. A falconiform is sister 
to that group, and a passeriform is sister to the clade including all four other birds. 
We also found the Fig. 8.4B topology in analyzing just the 12 mt protein-coding 
genes based on (1) both DNA (weighted as in Fig. 8.4B) and amino acids (using 
the PROTPARS weight matrix) in parsimony analyses with PAUP, and (2) using 
a maximum-likelihood approach with protML (version 2.2) and the JTT model 
(Jones et al., 1992) in the MOLPHY set of programs (Kishino and Hasegawa, 1990; 
Adachi and Hasegawa, 1992). Analysis of only the two mt rDNA genes placed the 
root (Alligator) along the Gallus branch, yielding a tree with Gallus as basal among 
birds and with Rhea sister to Falconiformes/Passeriformes. 

The sister relationship for Galliformes and Anseriformes (Fig. 8.4B) is in agree- 
ment with both morphological and molecular studies; however, the sister relation- 
ship between a Galliformes/Anseriformes clade and paleognaths conflicts with pre- 
vious molecular and morphological analyses (Stapel et al., 1984; Cracraft, 1988; 
Cracraft and Mindell, 1989; Sibley and Ahlquist, 1990). Curiously, summary 
DNA-DNA hybridization distance analyses actually do indicate a sister relation- 
ship between paleognaths and Galliformes/Anseriformes (Sibley and Ahlquist, 
1990, Fig. 357); however, this was considered as (Sibley and Ahlquist, 1990, p. 255) 
"misplaced," and an arrangement with Galliformes/Anseriformes as the oldest 
neognath clade was said to be (Sibley and Ahlquist, 1990, p. 288) “the best repre- 
sentation of all the data, morphological and molecular." Stapel et al. (1984) found 
evidence placing Galliformes and Anseriformes in a clade with 14 neognath species 
based on 173 amino acids from the nuclear-encoded a-crystallin A gene. 

Placement of the root is critical in resolving the position of Anseriformes/Gal- 
liformes as paleognaths (as in our best estimate) or as neognaths. Root placement 
must be considered carefully, as distant outgroups in particular can lead to spurious 
rooting on the longest internal branch of the ingroup (Wheeler, 1990b; Smith, 
1994). Considering the ingroup topology in Fig. 8.4B as an unrooted network 
(Fig. 8.4C), we find sister relationships for Anseriformes and Galliformes and for 
the two neognaths, with Rhea attached to their internode. Attachment of the root 
(Alligator) to the passeriform (Vidua), as in Fig. 8.4B, yields a sister relationship for 
the Anseriformes/Galliformes clade and a paleognath, as well as the unexpected 
position of Passeriformes as basal to other birds. Attachment of the root to Rhea 
yields the conventional placement of Anseriformes/Galliformes as sister to all other 
neognaths (Cracraft and Mindell, 1989; Sibley and Ahlquist, 1990). However, if the 
root were placed along the galliform or anseriform lineages Rhea would be sister to 
the two neognaths. Long branch attraction involving a distantly related outgroup, 
as in our case, is a strong possibility, and we look for inclusion of additional ingroup 
and outgroup taxa to help bisect long branches in future studies. Nonetheless, the 
basal position of Passeriformes and sister relationship for Anseriformes/Galliformes 
and a paleognath (Fig. 8.4B) is the most parsimonious explanation for this large and 
relatively conserved mt data set, and it is not our intention at present to dismiss this 
topology as a rooting anomaly. Mitochondrial 12S rDNA analyses discussed below 
strongly support Passeriform monophyly and a basal position for Passeriformes 
among an expanded set of neognath taxa. 
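
The sensitivity of sister-group statements to root placement can be made concrete with a small Python sketch that roots the unrooted ingroup network of Fig. 8.4C on different terminal branches. The edge list encodes only the topology described in the text; the internal node labels (n1, n2, n3) and the function names are hypothetical, and branch lengths are ignored.

```python
def root_on_leaf(edges, root_leaf):
    """Root an unrooted tree on the branch leading to `root_leaf`.

    `edges` is a list of (node, node) pairs; the result is a nested tuple
    whose first element is the taxon adjacent to the root.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    def subtree(node, parent):
        children = [n for n in neighbors[node] if n != parent]
        if not children:
            return node                         # terminal taxon
        return tuple(subtree(child, node) for child in children)

    (attachment,) = neighbors[root_leaf]        # a terminal has one neighbor
    return (root_leaf, subtree(attachment, root_leaf))


def newick(tree):
    """Parenthetical notation for a nested-tuple tree."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(newick(t) for t in tree) + ")"


# Unrooted ingroup network as described for Fig. 8.4C.
network = [
    ("Aythya", "n1"), ("Gallus", "n1"), ("n1", "n2"), ("Rhea", "n2"),
    ("n2", "n3"), ("Falco", "n3"), ("Vidua", "n3"),
]
for leaf in ("Vidua", "Rhea", "Gallus"):
    print(leaf, newick(root_on_leaf(network, leaf)))
```

Rooting on Vidua reproduces the Fig. 8.4B arrangement, rooting on Rhea gives the conventional arrangement, and rooting on Gallus places Rhea as sister to the two neognaths, all from the same unrooted network.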


B. Relationships among Neognath Orders 


Rooting is also a critical issue for resolving relationships among traditional neognath 
orders. If Passeriformes is basal among birds, as suggested by Fig. 8.4B, then croco- 
dilians are the appropriate outgroup. However, use of such an early divergent (dis- 
tant) outgroup appears to be pushing the limits of historical informativeness for 
12S rDNA characters alone. 12S rDNA is better suited for analysis of relationships 
among traditional neognath orders if an avian outgroup is used, and we provide 
such analyses using two anseriform taxa as an outgroup (Fig. 8.5). Use of anseriform
taxa as an outgroup is inconsistent with our Fig. 8.4B topology; however, it is consistent
with the conventional placement of Anseriformes/Galliformes as sister to all 
other neognaths (Cracraft and Mindell, 1989; Sibley and Ahlquist, 1990) and with 
the unrooted ingroup topology in Fig. 8.4C. 

Parsimony analyses of mt 12S rDNA for all characters using two anseriforms 
as an outgroup for 48 birds representing 15 traditional orders were conducted us- 
ing 1000 replicate searches with random addition of taxa in each replicate. Equal 
weighting for all characters yielded 10 equally parsimonious trees whose strict con- 
sensus resolved only 4 sets of ordinal relationships. Opisthocomus is sister to Tauraco, 
Trogon is sister to Chordeiles, Falco is sister to Tyto, and Passeriformes are sister to all 
the other neognath orders combined (Fig. 8.5A). Weighting transversions:transi- 
tions as 5:1 yielded a single tree also showing Passeriformes as basal to the other 
neognath orders, but showing differences from the equally weighted tree for the 
other taxa mentioned above (Fig. 8.5B). 

We consider the 5:1 tree a better representation of the phylogenetic signal 
within the 12S character set, given greater levels of convergent similarity in transi- 
tions (Figs. 8.2 and 8.3). The small number of resolved nodes in Fig. 8.5A further 
indicates reduced phylogenetic informativeness. Relationships among falconiform 
and strigiform taxa and within the families Accipitridae and Strigidae (similar in 
both trees) are similar to the few previous studies considering those taxa, and are 
discussed in following sections. Separate analyses focusing on buttonquail, hoatzin, 
and flamingo are also discussed below. 

Placement of Passeriformes as basal to the other neognath taxa, based on both 
the large 14-gene mt data set (Fig. 8.4) and the 12S rDNA data set including more 
taxa (Fig. 8.5), has not been indicated in previous analyses, and was unexpected. 
The fossil record for passeriform birds dates back only to the upper Oligocene, 
about 25 MYA (e.g., Mourer-Chauviré et al., 1989), and the earlier appearance in 
the fossil record of other modern bird lineages (see Olson, 1985; Feduccia, 1996) 

[Figure 8.5: tree diagrams not reproduced. Tip labels are listed as extracted for each panel; support indices on branches are omitted.]

(A) 1:1                        (B) 5:1
Turnix varia                   Opisthocomus hoazin
Opisthocomus hoazin            Phoenicopterus ruber
Tauraco hartlaubi              Chordeiles minor
Scolopax minor                 Sagittarius serpentarius
Chaetura cinereiventris        Pterocles coronatus
Coccyzus erythopthalmus        Scolopax minor
Coracias caudata               Diomedea nigripes
Diomedea nigripes              Tauraco hartlaubi
Nyctanassa violacea            Fulica atra
Phalacrocorax pelagicus        Chaetura cinereiventris
Pterocles coronatus            Coracias caudata
Trogon melanurus               Nyctanassa violacea
Chordeiles minor               Turnix varia
Fulica atra                    Coccyzus erythopthalmus
Phoenicopterus ruber           Phalacrocorax pelagicus
Mycteria americana             Buteo buteo
Buteo buteo                    Buteo jamaicensis
Buteo jamaicensis              Haliaeetus leucocephalus
Haliaeetus leucocephalus       Milvus migrans
Milvus migrans                 Circus aeruginosus
Circus aeruginosus             Accipiter superciliosis
Accipiter superciliosis        Gyps fulvus
Circaetus gallicus             Circaetus gallicus
Gyps fulvus                    Pernis apivorus
Gampsonyx swainsonii           Pandion haliaetus
Pernis apivorus                Gampsonyx swainsonii
Pandion haliaetus              Nyctea scandiaca
Falco peregrinus               Bubo virginianus
Tyto alba                      Asio flammeus
Sagittarius serpentarius       Otus megalotis everetti
Nyctea scandiaca               Otus megalotis nigrorum
Bubo virginianus               Mimizuku gurneyi
Asio flammeus                  Otus mirus
Otus megalotis everetti        Otus mindorensis
Otus megalotis nigrorum        Otus longicornis
Mimizuku gurneyi               Aegolius acadicus
Otus mirus                     Ninox philippensis
Otus mindorensis               Trogon melanurus
Otus longicornis               Tyto alba
Aegolius acadicus              Falco peregrinus
Ninox philippensis             Mycteria americana
Junco hyemalis                 Junco hyemalis
Motacilla cinerea              Motacilla cinerea
Sturnella magna                Sturnella magna
Cardinalis phoeniceus          Cardinalis phoeniceus
Vidua chalybeata               Vidua chalybeata
Lanius collurio                Lanius collurio
Sayornis phoebe                Sayornis phoebe
Aythya americana               Aythya americana
Cygnus buccinator              Cygnus buccinator

FIGURE 8.5 Phylogenetic hypotheses based on 859 12S mt rDNA sequence characters from 48 (ingroup)
bird species representing 15 traditional orders, using Cygnus and Aythya as outgroup taxa. (A) 
Strict consensus of 10 equally parsimonious trees of 2020 steps based on equal weighting for all characters 
in 1000 replicate searches with random addition of taxa. (B) Single most parsimonious tree of 3760 steps 
based on a transversion-to-transition weighting ratio of 5:1 in 1000 replicate searches with random addition
of taxa. A maximum of 10 trees were saved and branch-swapped per replicate search. Numbers on 
branches denote support indices. 


argues against a basal position for Passeriformes among extant avian lineages. There 
remains, however, a lack of fossil evidence for the passerine radiation, and Boles 
(1995) has described two bones, dating back to 54 MYA, that closely resemble those 
of Passeriformes. Passeriform placement in our analyses may be influenced by faster 
rates of mtDNA sequence evolution among them. We found a tendency toward 
faster rates of 12S mt rDNA evolution in some Passeriformes compared to other 
birds (Table II), and this could result in a long-branch attraction (Felsenstein, 1978) 
with Passeriformes drawn basally in phylogenetic hypotheses owing to greater 
convergent similarity at numerous sites with crocodilian (Fig. 8.4) or anseriform 
(Fig. 8.5) outgroups. Emphasis on relatively conserved, slowly evolving characters 
for both data sets, however, works to reduce long-branch attraction, and we cannot 
assume that the basal position indicated for Passeriformes is an artifact of rate differences.
The position of Passeriformes in DNA-DNA hybridization analyses (Sibley 
and Ahlquist, 1990) well within their “Neoaves,” and sister to the group including 
their Columbiformes, Gruiformes, and Ciconiiformes, is based on a short internode 
and is less than conclusive. 

Phylogenetic placement of sandgrouse (represented by Pterocles) has been contro- 
versial, with most debate focusing on their alternative placement with Columbi- 
formes (pigeons and doves; e.g., Wetmore, 1960; Mayr and Amadon, 1951) or 
Charadriiformes (shorebirds, gulls, and alcids; e.g., MacLean, 1969; Sibley and 
Ahlquist, 1990). We find Pterocles placed as sister to a charadriiform (Scolopax) in 
Fig. 8.5B (and sister to a clade including Scolopax in Fig. 8.6); however, we do not 
have sequence data for a columbiform and cannot test a sister relationship directly 
for Pterocles and Columbiformes relative to Charadriiformes. Lack of resolution for 
Pterocles (and many other taxa) in Fig. 8.5A suggests that the Pterocles lineage is a 
relatively old one in which multiple substitutions at individual nucleotide positions 
have diminished historical signal based on all characters equally weighted. 

Various higher level relationships based on single species representatives (Diomedea,
Tauraco, Fulica, Chaetura, Coracias, Nyctanassa, Coccyzus, and Phalacrocorax) of 
different orders, indicated in Fig. 8.5B, are largely unexpected on the basis of pre- 
vious studies. Use of single species representatives for divergent orders is particularly 
subject to confounding effects of convergent similarity and misdiagnosis of derived 
characters as ancestral, and we are skeptical of the indicated relationships. More 
sampling of taxa is needed. The nonsister relationship of Tauraco (family Musophag- 
idae) and Coccyzus (family Cuculidae) in Fig. 8.5 is consistent with the view of 
polyphyly for traditional Cuculiformes and the findings of Sibley and Ahlquist 
(1990). 


C. Placement of Buttonquail, 
Hoatzin, and Flamingoes 


Phylogenetic relationships for buttonquail and for the hoatzin (Opisthocomus hoazin) 
have been controversial. They overlap in their combined set of hypothesized rela- 
tives, so will be considered concomitantly here. Buttonquail (Turnicidae), also 
known as hemipodes, are small, running birds, similar in appearance to Old World 
quail. They exhibit several unique features, including reversed sexual dimorphism, 
a tendency toward polyandrous nesting habits, and, in some species, lack of a hind 
toe. They have variously been placed as Galliformes, Gruiformes, or as the sole 
family within Turniciformes. Analyses by Sibley and Ahlquist placed them as basal 
to all neognaths. They viewed this as an artifact of evolutionary rate differences 
among taxa and suggested that buttonquail are as likely to be members of Gruiformes
as of any other order, although they placed the buttonquail in a separate order 
in their classification. 

The hoatzin is a slender, pheasant-like bird, with bare, blue skin on the face, that 
nests communally (with the aid of nest-helpers). Morphological features include a 
large two-part crop used in fermentation and digestion of cellulose, and the pres- 
ence of two claws on the wings of nestlings. The hoatzin has been variously placed 
together with turacos, cuckoos, and Galliformes (reviewed in Cracraft, 1981; Sibley 
and Ahlquist, 1990; Hedges et al., 1995). 

Our analyses of mt 12S rDNA emphasizing relatively conserved characters in- 
dicate that the hoatzin is most closely related to cuckoos (as represented by Coc- 
cyzus), and not turacos or Galliformes (Fig. 8.6). This is in agreement with other 
molecular analyses (Hedges et al., 1995; Sibley and Ahlquist, 1972, 1990). We find 
this inference based on both transversion parsimony and a transversions:transitions 
weighting ratio of 5:1. Equal weighting for all characters yielded the following 
strict consensus topology for three equally parsimonious trees with 589 steps: 


(((((Scolopax, Fulica), Pterocles), Turnix, Opisthocomus, Tauraco), Gallus), Coccyzus). 


Our analyses show a buttonquail (Turnix) to be more closely related to a grui- 
form (Fulica) than to a galliform. However, Turnix is sister to an Opisthocomus/Coc- 
cyzus clade, and not to the gruiform. Relative rate comparisons suggest that Turnix 
mt 12S rDNA has not been changing at a rate significantly different from that of 
other birds in our sample (e.g., Table II). Turnix appears to represent a distinctive 
group of uncertain affinity. Turnix does not appear basal among other neognath 
orders (Fig. 8.5), which is inconsistent with DNA-DNA hybridization findings. 
Based on 12S rDNA transversions and the taxa set in Fig. 8.6, nine additional steps 
are required for the shortest tree uniting Turnix and Gallus as sisters, and four addi- 
tional steps are required to unite Turnix and Fulica as sisters. 

Flamingoes (Phoenicopteridae) present a mosaic of morphological features seen 
in disparate orders including webbed feet, lamellate bills, long legs, and long necks. 
Flamingoes have been considered variously as Anseriformes, Ciconiiformes, Charadriiformes,
or equally closely related to some subset of these three groups (reviewed
in Sibley and Ahlquist, 1990; Olson and Feduccia, 1980). Mitochondrial 
12S rDNA analyses suggest that Phoenicopterus is more closely related to cormorants 
(Phalacrocorax), herons (Nyctanassa), and storks (Mycteria) than to Anseriformes and 
Charadriiformes based on both transversions only, and a transversions:transitions 
weighting ratio of 5:1 (Fig. 8.7). Based on transversions only, three additional steps 
are required to place Scolopax as sister to Phoenicopterus, and to place Aythya as sister 
to Phoenicopterus. Exhaustive searches based on transversions and on a weighting 
ratio of 5:1, having Diomedea, Nyctanassa, and Phalacrocorax excluded, placed Phoen- 
icopterus and Mycteria as sisters, with Scolopax sister to them. The transversion parsi- 
mony support indices and branch lengths are small; however, they represent our 
most conservative set of characters for application to this question. 
[Figure 8.6: tree diagram not reproduced. Tip labels as extracted: Fulica atra, Tauraco hartlaubi, Coccyzus erythopthalmus, Opisthocomus hoazin, Turnix varia, Scolopax minor, Pterocles coronatus, Gallus gallus, Crocodylus acutus.]

FIGURE 8.6 Single most parsimonious tree (163 steps) based on an exhaustive search of mitochondrial
12S rDNA transversions only, indicating phylogenetic position of a buttonquail (Turnix varia) and 
the hoatzin (Opisthocomus hoazin) among traditionally hypothesized relatives. Numbers on branches denote
branch lengths/support indices. Analysis using a transversions:transitions weighting ratio of 5:1 
yielded a single most parsimonious tree (1257 steps) with the same topology as above, except Pterocles is 
placed basal to Turnix. 

[Figure 8.7: tree diagrams not reproduced. Tip labels as extracted: Nyctanassa violacea, Diomedea nigripes, Phoenicopterus ruber, Phalacrocorax pelagicus, Mycteria americana, Scolopax minor, Aythya americana, Crocodylus acutus.]

FIGURE 8.7 Phylogenetic position of a flamingo (Phoenicopterus ruber) relative to disparate taxa traditionally
proposed as relatives, based on exhaustive parsimony searches of mt 12S rDNA (A) transversions 
only (126 steps), and (B) a transversions:transitions weight ratio of 5:1 (consensus of two trees requiring 
1011 steps). Numbers in (A) denote branch lengths/support indices. 


D. Relationships among Passeriformes 


Our study taxa include seven passeriform species, and hypothesized relationships 
among them based on mt 12S rDNA (Fig. 8.5) are consistent with many previous 
analyses, showing an early divergence between suboscines (represented by Sayornis) 
and oscines and a basal divergence for Lanius relative to the other study oscines. 
Placement of Motacilla and Junco as sisters is unexpected, as Junco and Cardinalis are 
considered closer based on morphology and are members of the New World nine- 
primaried oscine family Emberizidae (Paynter, 1970) or subfamily Emberizinae 
(Sibley and Ahlquist, 1990). Motacilla, an Old World oscine in the family Motacil- 
lidae (Mayr and Greenway, 1960) or Passeridae (Sibley and Ahlquist, 1990), shows 
a significantly faster rate of mt 12S rDNA evolution compared to five of the six 
other oscines (comparisons not shown), and this may have influenced phylogenetic 
placement. A sister relationship between Motacilla and Vidua would be consistent 
with DNA-DNA hybridization analyses and existing classifications (Sibley and 
Ahlquist, 1990). 


E. Relationships among Anseriformes 
and Galliformes 


Our 12S rDNA analyses indicate monophyly for Galliformes and Anseriformes 
based on the set of 20 taxa analyzed (Fig. 8.8). The phylogenetic hypothesis shows 
sister relationships for swans and geese, and for swans and geese with an unresolved 
set of five diverse ducks (from the traditional subfamily Anatinae). Sister to all of 


[Figure 8.8: tree diagram not reproduced. Tip labels as extracted: Cygnus buccinator, Cygnus atratus, Anser rossi, Branta sandvicensis, Anas platyrhynchos, Anas formosa, Aix sponsa, Aythya americana, Somateria fisheri, Dendrocygna arcuata, Dendrocygna bicolor, Thalassornis leuconotus, Anhima cornuta, Chauna chaveria, Anseranas semipalmata, Gallus gallus, Coturnix coturnix, Bonasa umbellus, Phasianus colchicus, Meleagris gallopavo, Crypturellus undulatus, Rhea americana, Struthio camelus.]

FIGURE 8.8 Strict consensus of four equally parsimonious trees of 625 steps based on equal weights 
for all mt 12S rDNA characters for 20 anseriform and galliform species, using 3 paleognaths as an outgroup.
Random addition sequences for taxa were used in 1000 replicate searches. Numbers on branches 
denote support indices. Analysis using a transversions:transitions weighting of 5:1 (not shown) yielded a 
strict consensus of 5 trees requiring 1149 steps with the same topology shown, except that the only node 
resolved within galliforms was a sister relationship for Bonasa and Phasianus. 

these is a clade including two whistling ducks (Dendrocygna) and the white-backed 
duck (Thalassornis), and sister to all the above anseriforms is a clade including two 
screamers (Anhima, Chauna; family Anhimidae) and the magpie goose (Anseranas). 
The whistling ducks have traditionally been considered sister to the geese and swans 
within the subfamily Anserinae (Delacour and Mayr, 1945; Sibley and Ahlquist, 
1990); however, our analysis suggests a basal position to both geese and swans and 
the Anatinae, as found by Livezey (1986). Thalassornis was traditionally considered 
as one of the stiff-tailed ducks (tribe Oxyurini; Delacour and Mayr, 1945). Be- 
havioral and morphological characters were later identified that indicated a closer 
relationship between Thalassornis and whistling ducks (Johnsgard, 1967; Livezey, 
1986), although they were still not considered as sister taxa. Our 12S analyses do 
place Thalassornis and whistling ducks as sisters. Anseranas has traditionally been 
considered as within the Anatidae (e.g., Delacour and Mayr, 1945; Sibley and Ahl- 
quist, 1972), although Verheyen (1955) postulated a sister relationship for scream- 
ers and Anseranas based on skeletal morphology as we have found, and as indicated 
by DNA-DNA hybridization analyses of Sibley and Ahlquist (1990). DNA-DNA 
hybridization analyses of Madsen et al. (1988) and morphological analyses of Livezey 
(1986), however, did not support this sister relationship. 

12S rDNA analyses suggest that Meleagris is sister to a clade including the other 
four galliform species, in which Gallus and Coturnix are sisters and Bonasa and 
Phasianus are sisters (Fig. 8.8). A basal divergence for Meleagris within the group is 
consistent with morphological classifications (e.g., Wetmore, 1960), DNA-DNA 
hybridization (Sibley and Ahlquist, 1990), and mt cytochrome b (cytb) analyses 
(Kornegay et al., 1993). Placement of Bonasa is unexpected given its apparent close 
relationship to Meleagris in other analyses (Sibley and Ahlquist, 1990; Ellsworth 
et al., 1996). An exhaustive search including the five phasianid species and Cygnus 
buccinator and Aythya americana as an outgroup, using all characters, yielded two 
equally parsimonious trees, one identical to the topology in Fig. 8.8, and the other 
with Meleagris sister to a Bonasa/Phasianus clade. 


F. Relationships among Falconiformes 
and Strigiformes 


Phylogenetic relationships among the primary groups of predatory birds [including 
owls, hawks, eagles, Old and New World vultures, falcons, caracaras, and the secretary 
bird (Sagittarius serpentarius)] have been controversial. Morphological analyses 
have led some to suggest that certain of these primary groups (particularly owls, 
falcons, hawks, and secretary bird) are no more closely related to each other than 
to various other orders of birds (Hudson, 1948; Jollie, 1976-1977). Strigiformes 
(owls) have been considered closely related to the Falconiformes (hawks, eagles, 
vultures, falcons, caracaras, and secretary bird) by a few researchers (i.e., Reichenow, 
1913-1914; Cracraft, 1981), but only distantly related by most others (Chandler, 
1916; Wetmore, 1960; Brown and Amadon, 1968; Sibley and Ahlquist, 1990; Grif- 
fiths, 1994). Correspondingly, behavioral similarities between falcons and owls, 
such as an absence of nest-building, killing of prey by severing neck vertebrae, and 
holding of food in one claw have been variously considered as either shared derived 
or convergent traits. The cursorial, snake-hunting secretary bird is most often con- 
sidered a specialized accipitrid (hawks and eagles), although character support for 
this relationship is limited. Several researchers have noted superficial similarities be- 
tween secretary bird and cariamids within the Gruiformes (e.g., Mayr and Amadon, 
1951). The osprey (Pandion haliaetus) traditionally comprises the family Pandionidae 
and is placed closest to Accipitridae. Considerable evidence indicates that New 
World vultures are actually Ciconiiformes (e.g., Friedmann, 1950; Ligon, 1967; 
Sibley and Ahlquist, 1990), although this is not supported by analyses of syringeal 
morphology (Griffiths, 1994; see also Seibold and Helbig, 1995). We have no se- 
quence data for New World vultures at present, and cannot address this last issue. 

In their analyses of falconiform lineages, Sibley and Ahlquist (1990; p. 486) say 
the “positions of these [melting] curves are probably due as much to the different 
rates of DNA evolution as they are to times of divergence.” In light of this, their 
phylogenetic placement of these taxa rests entirely on the unspecified “corrections” 
applied, and it is difficult to assess the evidential basis of their hypotheses. 

Our analyses of mt 12S rDNA are inconsistent with traditional configurations 
of Falconiformes and Strigiformes. For analyses using both 1:1 and 5:1 weighting 
ratios of transversions:transitions, Falco and Tyto are sister taxa, joined basally to a 
clade of 11 owls in the family Strigidae (Fig. 8.9). A sister relationship for Strigi- 
formes and Caprimulgiformes (represented by Chordeiles), as suggested by some 
(e.g., Sibley and Ahlquist, 1990) is not supported by the 12S characters. 12S analyses 
do not place secretary bird any closer to the accipitrids or Falco than to taxa repre- 
senting other orders. Nor does 12S mt rDNA support a sister relationship between 
secretary bird and a gruiform (Fulica). Exhaustive parsimony searches (not shown) 
for Fulica, Sagittarius, Buteo, and Falco with Aythya as an outgroup yielded the in- 
group topology: 


(((Buteo, Falco) Sagittarius) Fulica) for all characters and an unresolved trichotomy 
((Buteo, Falco, Sagittarius) Fulica) for weight ratios of 5:1 and 1:0. 


Placement of Sagittarius may be influenced by relatively fast rates of 12S sequence 
change (Table II) and potential long-branch attraction. 
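
To give a sense of what an exhaustive search involves, the number of distinct unrooted, fully resolved trees for n taxa is (2n - 5)!!, a standard combinatorial result. The sketch below is illustrative only and is not part of the original analyses; the function name is hypothetical. With Aythya plus the four ingroup taxa above (five taxa in all) there are 15 candidate topologies, so every one can be scored, but the count grows very quickly as taxa are added.

```python
def num_unrooted_trees(n_taxa: int) -> int:
    """Count unrooted, fully bifurcating trees for n_taxa tips: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    # 5 taxa: four ingroup taxa plus one outgroup; 7 and 13 shown for scale.
    for n in (5, 7, 13):
        print(n, num_unrooted_trees(n))
```
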

We have 518 bases of mt COI available for 22 of the 29 taxa in the above 12S 
analyses (Fig. 8.9) and combined them in further parsimony searches. The total 
evidence analysis for 22 ingroup taxa based on equal weighting for all characters 
indicates a sister relationship for Falco and Sagittarius, and an unresolved polytomy 
for those two taxa and the other primary lineages (Fig. 8.10A). Use of unequal 
weighting (5:1 for 12S and codon positions 1 and 2 only for COI) indicates an 
unresolved polytomy for three lineages: Falco, Sagittarius, and accipitrids plus Pandion 
(Fig. 8.10B). This unites the lineages traditionally included in Falconiformes 
and we consider this our current best estimate of their phylogenetic relationships. 

[Figure 8.9 tree diagrams not reproduced; panel (A) uses 1:1 weighting and panel (B) uses 5:1 weighting.] 

FIGURE 8.9 Phylogenetic hypotheses based on 1000 replicate searches with random addition of taxa 
for 12S mt rDNA for 28 (ingroup) bird species focusing on falconiform and strigiform taxa. Aythya 
americana was the designated outgroup. (A) Strict consensus of 3 equally parsimonious trees of 1119 steps 
based on a transversions:transitions weighting ratio of 1:1. Numbers denote support indices. (B) Strict 
consensus of 2 equally parsimonious trees of 2092 steps, based on a transversions:transitions weighting 
ratio of 5:1. 


1. Relationships within Accipitridae 


The 208 species in the cosmopolitan family Accipitridae represent the largest radia- 
tion of diurnal birds of prey. Subgroups that have been recognized based on oste- 
ology, myology, plumage, and behavior include milvine and nonmilvine kites, sea 
eagles, Old World vultures, snake eagles, accipiters, chanting goshawks, harriers, 
booted eagles, buteos (or buzzards), subbuteos, and harpy eagles (Brown and Ama- 
don, 1968). However, phylogenetic relationships among the 60 or so accipitrid 
genera are little known (Amadon, 1982). Jollie (1976-1977) professed an inability 
to identify either derived features or character transformation series from his detailed 
anatomical studies. Jollie and others have attributed this to the extreme specializations 
found in most avian predators and subsequent difficulty in distinguishing 
results of common ancestry and convergence. The only previous molecular 
study has been based on DNA-DNA hybridization and included a small number 
of accipitrid groups (Sibley and Ahlquist, 1990). 

FIGURE 8.10 Phylogenetic hypotheses based on combined 12S mt rDNA (859 characters) and mt 
COI sequences (518 characters) for 22 (ingroup) bird species, focusing on falconiform and strigiform 
taxa. Anas platyrhynchos was the designated outgroup. One thousand replicate searches with random 
addition of taxa were conducted for (A) all characters equally weighted (strict consensus of 6 trees 
requiring 1908 steps) and (B) 12S characters weighted with a transversions:transitions ratio of 5:1 and COI 
third-codon positions given a weight of 0 (strict consensus of 4 trees requiring 1930 steps). Numbers on 
branches in (A) denote support indices. 

We present two analyses focusing on accipitrid relationships using Sagittarius 
and Falco as outgroup taxa in exhaustive searches. The first is based on 12S rDNA 
and includes 11 ingroup taxa (Fig. 8.11), and the second is based on 12S rDNA and 
COI for 10 ingroup taxa (Fig. 8.12). These analyses are largely consistent with those 
based on all falconiform and strigiform taxa (Figs. 8.9 and 8.10) and may be sum- 
marized as follows. The two Buteo species are sisters as expected. A sister relationship 
for Milvus and Haliaeetus is consistent with the hypothesis of close relationship be- 
tween the sea eagles and the milvine kites postulated by others based on aspects 
of morphology and behavior (Brown and Amadon, 1968; Amadon, 1982; Olson, 
1982). We found consistent support for a sister relationship between the Buteos and 
the Milvus/Haliaeetus clade (Figs. 8.9–8.12), suggesting that the milvine kites and 
sea eagles may not be basal among accipitrids as they have been considered to 
be based on morphology and behavior (Amadon, 1982; but see Griffiths, 1994). 
Moving toward the base of the combined data set topology (Fig. 8.10), Circus and 
Accipiter are weakly supported as sisters, followed by a Gyps/Circaetus clade and a 

FIGURE 8.11 Single most parsimonious tree (475 steps) based on mitochondrial 12S rDNA using a 
transversions:transitions weighting ratio of 1:1 for 11 Accipitridae (ingroup) species with Falco and 
Sagittarius as an outgroup. Numbers on branches denote branch lengths/support indices. 


240 D. P. Mindell et al. 


FIGURE 8.12 Single most parsimonious tree (858 steps) based on mt 12S rDNA and mt COI 
sequences combined with all characters equally weighted for 10 (ingroup) Accipitridae species with Falco 
and Sagittarius as an outgroup. Numbers on branches denote branch lengths/support indices. 


Pernis/Pandion clade. Based on 12S rDNA characters Gampsonyx appears as either a 
basal unresolved lineage (Fig. 8.9A), together with Pernis and Pandion (Fig. 8.9B), 
or basal among accipitrids and Pandion (Fig. 8.11). 

The kites comprise a diverse set of accipitrids, and suspected primitive features, 
thought to be retained from less predatory ancestors, include mild predatory hab- 
its and reduced sexual dimorphism in many species (Brown and Amadon, 1968). 
Friedmann (1950) divided kites into three groups, represented by Milvus, Pernis, 
and Gampsonyx in our study taxa, which are clearly polyphyletic in our analyses 
(Figs. 8.9–8.12). Analyses of syringeal morphology also indicate polyphyly of the 
kites (Griffiths, 1994). Pandion, the osprey, although generally considered as comprising 
a separate family (Pandionidae) based on distinctive morphological features, 
is consistently shown as sister to Pernis (Figs. 8.9–8.12). 

Greater topological similarity of Fig. 8.11 (12S, 1:1) with Fig. 8.9B (12S, 5:1) 
than with Fig. 8.9A (12S, 1:1) is likely due to reduction of the confounding effects 
of homoplasy in using outgroups that are more closely related to the ingroup taxa. 


2. Relationships within Strigidae 


The Strigidae, or recent owls (Amadon and Bull, 1988), includes two primary 
clades. The barn owls and bay owls (subfamily Tytoninae) comprise one, and all the 
other owls (the "typical owls"; subfamily Striginae) comprise the second. A variety 
of classifications (e.g., Ford, 1967) have been presented for Striginae taxa, although 
few phylogenetic analyses have been conducted. 

Our hypothesized relationships for strigids are identical in Figs. 8.9A and B, and 
8.10B. The only conflicting topology (Fig. 8.10A) involves a different placement 
for Asio which may be influenced by inclusion of rapidly evolving COI third-codon 
positions. We found monophyly of the scops owls (Otus, Mimizuku), a close rela- 
tionship between the scops owls and a Nyctea/Bubo clade, and a sister relationship 
for Aegolius and Ninox, which are, in turn, sister to the others. Tyto is consistently 
placed outside the Striginae taxa. Relationships for the seven Striginae genera common 
to our analyses and DNA-DNA hybridization analyses (Sibley and Ahlquist, 
1990) are congruent with the exception that our analyses show Aegolius and Ninox 
as sisters, rather than unresolved, and Asio as sister to the Nyctea/Bubo clade rather 
than unresolved. 


V. CONCLUSIONS 


Phylogenetic analyses presented here based on mitochondrial DNA characters ad- 
dress a number of controversial issues. Anseriformes and Galliformes are supported 
as sister taxa that are more closely related to a paleognath (Rhea) than to a set of 
neognaths. Placement of the root is critical in this latter determination, and addition 
of sequences from more taxa within the avian ingroup and within the crocodilian 
outgroup clade (including alligatorids, crocodilids, and gavialids) may help reduce 
potential attraction among long branches. It is doubtful, however, that greater sam- 
pling of extant forms will ever eliminate the problem entirely. Molecular and mor- 
phological characters from extinct lineages arising from the phylogenetic internode 
between crocodilians and birds could be useful in rooting phylogenies, if any such 
characters become available. Use of duplicated gene sequences as outgroups, where 
duplications occur prior to diversification events, has been demonstrated by Iwabe 
et al. (1989). This approach could also be useful in studies of avian phylogeny, 
if gene duplications (or ex-mt nuclear genes) are found that predate divergences 
among extant avian lineages and postdate the split between birds and crocodilians. 
In analyses of two different data sets, Passeriformes are indicated as basal 
(1) among five lineages representing the oldest divergences among extant birds, and 
(2) among a set of neognaths. Basal placement among neognaths uses an avian outgroup, 
which helps reduce the potential effect of a long branch introduced with a 
crocodilian outgroup. However, this makes the assumption that Passeriformes are 
not basal to the outgroup Anseriformes. Our rooted phylogeny in Fig. 8.4B based 
on more than 13 kb of mitochondrial sequence is inconsistent with the sequence of 
appearance of fossil forms and with previous molecular analyses of Sibley and 
Ahlquist (1990). It should be remembered, however, that DNA-DNA hybridiza- 
tion distance analyses by Sibley and Ahlquist are entirely unrooted, with the earliest 
divergence among extant birds being based on a form of midpoint rooting and 
unsupported assumptions of evolutionary rate homogeneity. 

Our evidence indicates that a buttonquail is more closely related to a gruiform 
than to a galliform. However, the buttonquail is sister to a cuckoo/hoatzin clade, 
and not sister to the gruiform. Hoatzin appears most closely related to cuckoos (as 
represented by Coccyzus), and not turacos or Galliformes. A flamingo (Phoenicopte- 
rus) appears more closely related to cormorants, herons, and storks than to either 
Anseriformes or Charadriiformes. The magpie goose (Anseranas) is supported as 
being sister to screamers (Anhimidae) rather than sister to the other waterfowl (An- 
atidae). Based on conserved mt 12S rDNA and COI characters combined, we found 
Falconiformes to be monophyletic with a polytomy for three primary lineages: 
Accipitridae species, secretary bird, and a representative of Falconidae. Kites are 
found to be polyphyletic, with osprey (Pandion) being placed within an accipitrid 
clade (as sister to Pernis) rather than sister to all accipitrids. 

Limitations of our analyses of avian phylogeny are common to many others in- 
volving higher level taxa and rapid radiations of forms. Greater sampling of taxa 
within diverse lineages (different orders) of birds is needed, as inclusion of only one 
or a few taxa from distantly related lineages may work to increase prevalence of long 
branches. We suspect that if all orders within our study taxa were as well represented 
as those of Falconiformes and Strigiformes there would be fewer polytomies de- 
noting ordinal relationships and greater stability of topology based on alternative 
character weighting approaches. Despite more complete and more balanced sam- 
pling of taxa, some relationships are likely to remain poorly resolved, particularly 
where time elapsed between divergences is small or where divergences are con- 
comitant. Ideally, systematists would like to find characters that experienced signifi- 
cant change during the radiations, but little thereafter, and such a pattern may occur 
more often for morphological characters associated with selection and cladogenetic 
events (Lanyon, 1988; Olmstead et al., 1990). Collection of additional morphologi- 
cal and molecular characters for use in combined analyses may help in this regard. 
Numerous studies indicate increased resolving power of larger character data sets 
(e.g., Hillis et al., 1994; Charleston et al., 1994; Cummings et al., 1995; Mindell and 
Thacker, 1996). Further analyses of the sequence data presented here are needed, 
using alternative 12S rDNA alignments, and iterative weighting approaches, and 
these are likely to yield some differences in phylogenetic inference. 

Ultimately, enhanced phylogenetic resolution requires comprehensive analyses 
of increased numbers of taxa and characters. The primary challenge for systematists, 
beyond data collection, is determining how to conduct the comprehensive analyses 
in light of increased understanding of the varied constraints on character evolution. 


I. INTRODUCTION 


The schism between microevolutionary and macroevolutionary research erected in 
the 1970s is considered by many biologists today to be an artificial one. Whereas 
the pattern of evolution and the relative strength of various evolutionary forces may 
well differ markedly at different hierarchical levels (Gould, 1994), the restriction of 
specific mechanisms to individual levels and the independence of these levels from 
one another is considered by many a nonissue. Evolutionary processes of drift, selection, 
and mutation acting within populations (microevolution) can explain large- 
scale evolutionary trends between species and higher taxa (macroevolution) without 
recourse to additional phenomena (Charlesworth et al., 1982). Most of the early 
debates focused on phenotypic evolution as a forum for thrashing out these issues, 
and molecular evolution likely received far less attention in this context because of 
the ease with which gene trees spanning multiple hierarchical levels can be drawn 
(Avise, 1994). 

While the bridges between microevolutionary and macroevolutionary studies of 
molecular evolution may be more obvious than those for phenotypic evolution, 
they are far from complete. Enormous conceptual shifts in this direction have been 
brought about by efforts to build and interpret large-scale phylogenetic trees of 
DNA sequences and alleles within species (Avise et al., 1987; Cann et al., 1987; 
Vigilant et al., 1991; Edwards, 1993a; Baker et al., 1993; Bowen et al., 1994), the 
advent of "genealogical" models in population genetics (Slatkin and Maddison, 
1989, 1990; Hudson, 1991), and the inculcation of "tree thinking" and the com- 
parative method, both above the species level (Felsenstein, 1985; O'Hara, 1988; 
Harvey and Pagel, 1991; Brooks and McLennan, 1990) and below (O'Hara, 1993; 
Edwards and Kot, 1995). But Felsenstein's (1988, p. 445) statement that "systema- 
tists and evolutionary geneticists don't often talk to each other" was both retrospec- 
tive and prospective, and the problem is particularly acute in molecular evolutionary 
ornithology. Although much of the conceptual framework that justifies ignoring 
the species boundary in molecular evolution has been laid, the flood of sequence 
data in higher level avian systematics has not been accompanied by an equivalently 
enthusiastic concern for the population genetic bases underlying the patterns of 
sequence variability observed among higher taxa. To the extent that this dialogue 
between levels in the hierarchy is not pursued, the potential richness of the inter- 
actions between levels will not be realized. 

This chapter reviews studies of molecular evolution in birds that illustrate how 
molecular and population phenomena observable within species can affect both the 
analysis and outcome of patterns observed when comparing sequences from repre- 
sentatives of higher taxa. This theme will be explored with examples from five 
avenues of current research by this author and others: (1) the implications of intra- 
specific polymorphism for higher level molecular systematics, (2) the application of 
patterns of nucleotide substitution inferred from lower level comparisons to phy- 
logenetic analyses of higher taxa, (3) the effect of incomplete lineage sorting and 
hybridization on gene trees among higher taxa, (4) the effect of population structure 
on the dating of vicariant biogeographic splits, and (5) the impact of selection oc- 
curring within species on patterns of diversity and phylogenetic analysis of long- 
diverged sequences. One may conclude that molecular evolutionary studies of birds 
below the species level have more than passing relevance for higher level systematics 
and that useful insights can be gained by simultaneous analysis at both levels. 


II. MOLECULAR VARIABILITY 
A. The Specter of Polymorphism 


Intraspecific polymorphism at the molecular level is a ghostly specter, an ever- 
present, inescapable shadow looming over the shoulder of the higher level systema- 
tist. For either sequence or allozyme data, the problem has been raised repeatedly 
and is greatest when the variability of characters is observable only on examination 
of taxonomic levels lower than the sampling scheme employed. Characters that 
may appear to unite members of particular clades may in fact represent parallelisms 
or convergences on denser sampling. For DNA sequence data, sites that appear to 
serve as synapomorphies linking clusters of taxa may in fact be polymorphic within 
taxa, or undergo further change, at the tips of these clusters, rendering their actual 
use as synapomorphies, or their inferred number of changes on the tree, dubious 
(Fig. 9.1). In this context, "polymorphism" is a problem at any level in the hier- 
archy, within or between species, so long as it occurs at a level lower than that being 
analyzed. 
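
The way denser sampling exposes hidden polymorphism can be illustrated with Fitch parsimony, which counts the minimum number of changes a site requires on a given tree. The sketch below is a minimal illustration and not the analysis behind Fig. 9.1; the tree shapes, tip names, and nucleotide states are all hypothetical. A site that needs a single change when each babbler species is represented by one sequence needs two once a variable individual is added within one species.

```python
def fitch_steps(tree, states):
    """Minimum number of changes at one site on a rooted binary tree.

    tree   -- nested 2-tuples of tip names, e.g. (("a", "b"), "c")
    states -- dict mapping each tip name to its nucleotide at the site
    """
    def visit(node):
        if isinstance(node, str):            # tip: the observed base
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        shared = left_set & right_set
        if shared:                           # children agree: no new change
            return shared, left_steps + right_steps
        # children disagree: keep both options and count one change
        return left_set | right_set, left_steps + right_steps + 1

    return visit(tree)[1]


# Hypothetical exemplar sampling: one sequence per babbler species.
exemplar_tree = ((("babbler_1", "babbler_2"), "other_1"), "other_2")
exemplar_states = {"babbler_1": "T", "babbler_2": "T",
                   "other_1": "C", "other_2": "C"}

# Hypothetical dense sampling: three individuals of the first species,
# one of which carries the "outgroup" base at this site.
dense_tree = (((((("b1_a", "b1_b"), "b1_c"), "babbler_2"), "other_1"), "other_2"))
dense_states = {"b1_a": "T", "b1_b": "C", "b1_c": "T", "babbler_2": "T",
                "other_1": "C", "other_2": "C"}

if __name__ == "__main__":
    print(fitch_steps(exemplar_tree, exemplar_states))
    print(fitch_steps(dense_tree, dense_states))
```
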

Intraspecific polymorphism in DNA sequences can have two sources. One is 
when alleles in an ancestral species do not sort completely between the time of 
population splitting and the time of sampling; here, even in the absence of mutation, 
there is a possibility of reaching incorrect phylogenetic conclusions when a single 
allele from a species is sampled (see Section II,B). The other source is mutations 
arising within species whose allelic lineages have sorted completely. The widespread 
use of mitochondrial DNA (mtDNA) in higher level avian systematics has positive 
and negative attributes with respect to these sources of polymorphism. Because the 
effective population size of mtDNA is usually about one-quarter that of a nuclear 
gene (barring large deviations from equal sex ratios and high rates of paternal leak- 
age), the problem of incomplete lineage sorting prior to sampling is minimized 
compared to an average nuclear gene (Moore, 1995). However, while the high rate 
of substitution in mtDNA coding and noncoding sequences renders them useful at 
a variety of hierarchical levels, the possibility of ignoring unseen polymorphism due 
to mutations is great, especially if the time between population splitting and sam- 
pling is long, i.e., even long after reciprocal monophyly of descendent lineages has 
been achieved. (Reciprocal monophyly for a given locus refers to the condition in 
which all allelic lineages within each of two species descending from a single ances- 
tral gene pool form monophyletic groups.) This latter situation characterizes studies 
of cytochrome b in babblers of the Australo-Papuan songbird genus Pomatostomus. 
Sequencing of a 282-bp segment of this gene revealed 17 variable sites among 
16 individuals from the 2 lineages (eastern and western, gray-crowned and red- 
breasted, respectively) within the gray-crowned babbler (Pomatostomus temporalis; 
Edwards and Wilson, 1990). Phylogenetic analysis, however, showed that the 
mtDNAs within either lineage were monophyletic, i.e., stemmed from a common 
ancestor within those lineages. Thus, whatever polymorphism is missed in 

FIGURE 9.1 Examples from two sites in the mitochondrial cytochrome b gene illustrating the 
discovery of hidden polymorphism on denser taxonomic sampling. Illustrated here is the reconstruction of 
variability in two sites (positions 18 and 237 of the 282-bp cytochrome b segment sequenced in Edwards 
and Wilson, 1990) along a phylogenetic tree consisting of 14 passerine birds (left; Edwards et al., 1991; 
with reanalysis in Edwards and Arctander, 1996) and a second tree (right) consisting of the above taxa 
plus 17 additional sequences within babblers (Pomatostomus). Common names of babblers are as follows: 

comparisons of single individuals at this or higher levels likely stems from recurrent 
mutation, not incomplete sorting (Fig. 9.1). 
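
The one-quarter figure for mtDNA cited above follows from counting gene copies under idealized assumptions. The sketch below assumes a Wright-Fisher style population, strictly maternal inheritance of mtDNA, and the usual sex-ratio formula Ne = 4NfNm/(Nf + Nm) for the nuclear locus; the function names are hypothetical. It also shows why the text adds the caveat about unequal sex ratios: with a strong excess of breeding females the ratio can exceed one.

```python
def effective_gene_copies(n_females: float, n_males: float) -> dict:
    """Effective numbers of gene copies for an autosomal locus and for mtDNA.

    Assumes an idealized population, maternal transmission of mtDNA, and
    Ne = 4 * Nf * Nm / (Nf + Nm) for the nuclear locus.
    """
    ne_individuals = 4 * n_females * n_males / (n_females + n_males)
    return {
        "nuclear": 2 * ne_individuals,  # diploid: two copies per individual
        "mtdna": n_females,             # haploid, passed on by females only
    }


def mtdna_to_nuclear_ratio(n_females: float, n_males: float) -> float:
    """Ratio of mtDNA to nuclear effective gene copy number."""
    copies = effective_gene_copies(n_females, n_males)
    return copies["mtdna"] / copies["nuclear"]


if __name__ == "__main__":
    print(mtdna_to_nuclear_ratio(500, 500))  # equal sex ratio
    print(mtdna_to_nuclear_ratio(900, 100))  # strongly female-biased
```
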

We can estimate the magnitude of the problem of polymorphism by measuring 
changes in the phylogenetic signal brought about by changing the level of taxon 
sampling in phylogenetic analysis (Table I). For example, for 3 babblers, 10 other 
perching birds (Passeriformes), and a woodpecker, Edwards et al. (1991) compared 
cytochrome b sequences (the "exemplar tree") that spanned those determined 
within each of the babbler species earlier (the "babbler tree"). Despite the larger 
number of sequences in the babbler tree (Table I) the phylogenetic information in 
the sequences at this level, as measured by consistency and retention indexes, in- 
creases. By contrast, the strength of the branch leading to the babblers drops slightly 
on more intense sampling at low taxonomic levels (exemplar tree vs full tree, 
Table I). Thus, in this example the phylogenetic level of sampling as well as the 
number of sequences used can influence the perceived phylogenetic signal in the 
data (cf. Sanderson and Donoghue, 1989). Although the number of characters in- 
ferred to have changed unambiguously on this branch remains the same, the list of 
characters changes (Table I): three of these sites are polymorphic either within the 
gray-crowned babbler or among the five babbler species (Edwards and Wilson, 
1990; Table I) and are interpreted as convergences or homoplasy only on denser 
taxon sampling. This effect reflects the fact that the sampling of taxa can influence 
the inferred reconstruction of events by parsimony (Wilson et al., 1991). 
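
The consistency and retention indexes used above are simple ratios of step counts. As a minimal sketch, using the usual definitions (CI = M/S and RI = (G - S)/(G - M), where S is the observed tree length, M the minimum conceivable number of steps, and G the maximum), they can be computed as follows. The function names and the numbers in the example are hypothetical and are not taken from Table I.

```python
def consistency_index(min_steps: float, observed_steps: float) -> float:
    """CI = M / S: 1.0 means no homoplasy; lower values mean more."""
    if observed_steps <= 0:
        raise ValueError("observed_steps must be positive")
    return min_steps / observed_steps


def retention_index(min_steps: float, max_steps: float,
                    observed_steps: float) -> float:
    """RI = (G - S) / (G - M): fraction of possible synapomorphy retained."""
    if max_steps == min_steps:
        raise ValueError("RI is undefined when max_steps equals min_steps")
    return (max_steps - observed_steps) / (max_steps - min_steps)


if __name__ == "__main__":
    # Hypothetical data set: at least 50 steps needed, at most 200 possible,
    # and 100 steps on the tree being evaluated.
    print(consistency_index(50, 100))
    print(retention_index(50, 200, 100))
```

Because CI tends to fall as taxa are added while RI is scaled by the maximum possible length, the two indexes need not move together when the sampling level changes.
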

We can further use the combination of low taxonomic level and high taxonomic 
level data sets to test the adequacy of certain methods of correcting for unseen 
substitutions due to unsampled nodes in a phylogeny. Fitch and Bruschi (1987) 
pointed out that there is a positive correlation between the number of nodes passed 
through from tips of a tree to the ancestor of the ingroup ("penultimate ancestor") 
and the number of substitutions inferred by parsimony. Such a correlation, which 
is also evident in the cytochrome b data for passerines (Fig. 9.2A), suggests that 
inferred substitutions are being missed along lineages with fewer branches. The 
method they propose for correcting branch lengths for such unseen substitutions 
leaves the lengths of those lineages passing through the largest number of nodes in 
the tree (the "trunk" of the tree) uncorrected. We can test the adequacy of this 
aspect of their method by asking how the branch lengths of such lineages change 
when we add the 17 additional babbler sequences to the exemplar tree. The increase 
in branch length inferred by parsimony in the full tree (Fig. 9.2B; Table I) suggests 
that, as Fitch and Bruschi (1987) suggest, even the lengths of those lineages passing 
through the largest number of nodes are likely underestimated by parsimony. In 
summary, although noise in the higher level phylogenetic analysis was increased by 


G Babbler, gray-crowned (temporalis); C Babbler, chestnut-crowned (ruficeps); H Babbler, Hall's (halli); 
W Babbler, white-browed (superciliosus); R Babbler, rufous (isidori). The two sets of trees illustrate the 
discovery of nucleotide polymorphism (both sites) and nucleotide convergence (site 237) at lower 
taxonomic levels, and the extra steps revealed by denser sampling. Both sites are third positions of codons. 


TABLE I Perceived Information Content and Substitution Dynamics of Perching Bird 
Mitochondrial Cytochrome b Sequences under Different Intensities of Taxon Sampling (a) 

Parameter                                   Babbler tree   Exemplar tree        Full tree 
------------------------------------------------------------------------------------------------ 
Taxonomic level of sequence comparison      Within genus   Between families     Within genus, 
                                                                                between families 
Number of sequences                         20             14                   34 
Tree length                                 106            373                  429 
Perceived information content 
  Consistency index                         0.61           0.52                 0.46 
  Retention index                           0.83           0.30                 0.61 
  g1 statistic                              -0.78          -0.58                -0.51 
Support for branch leading to babbler 
sequences 
  Number of unambiguous sites               -              5 (sites 18, 49,     5 (sites 49, 105, 
  supporting branch (b)                                    105, 111, 237)       111, 237, 279) 
  Bootstrap value of branch                 -              96                   88 
Number of inferred unambiguous 
changes within babbler clade                106            40                   63 
Transition/transversion ratio 
  Parsimony                                 14.14          1.34                 1.69 
  Maximum likelihood (c)                    29.1           3.0                  3.8 
Variation in substitution rate among sites 
(Wakeley, 1993) 
  Mean number of steps per site             0.38           1.32                 1.52 
  Variance in number of steps per site      0.67           3.46                 4.53 
  f value (d)                               5.2            19.1                 28.7 
  α (e)                                     0.48           0.82                 0.77 

(a) The "exemplar" tree consists of 12 perching bird sequences from Edwards et al. (1991) and using the 
thrush (Catharus guttatus) sequence from Helm-Bychowski and Cracraft (1993) as analyzed in Edwards 
and Arctander (1996). The "full" tree consists of the above tree plus 17 additional sequences from within 
the "babbler" genus Pomatostomus, whose phylogenetic relationships are presented in Edwards and 
Wilson (1990). The babbler tree consists of the latter 17 sequences only. All analyses were performed in 
MacClade (version 3.0; Maddison and Maddison, 1992) and PAUP (version 3.0s; Swofford, 1991). 

(b) Site numbers correspond to those listed for the 282-bp segment in Edwards and Wilson (1990), 
beginning with 1 for the first site. 

(c) The maximum likelihood estimate of the transition/transversion ratio was obtained using the 
program NUCML in the phylogenetics package MOLPHY by Adachi and Hasegawa (1995). 

(d) Test statistic for rate nonuniformity based on the mean and variance of the number of parsimony 
steps per site (Wakeley, 1993). 

(e) Inverse of the coefficient of variation of the substitution rate among sites. Smaller values of α indicate 
greater perceived variation in rate among sites relative to larger values. 

FIGURE 9.2 Effect of number of nodes and taxonomic sampling on the inferred number of changes 
along lineages in trees for passerine cytochrome b sequences. (A) Fitch-Bruschi (1987) plot illustrating 
the dependence of number of inferred parsimony steps on the number of nodes passed through from 
root of the perching bird tree (Edwards et al., 1991; Edwards and Arctander, 1995). The slope of the line 
is 6.5. (B) Capture of extra parsimony steps along four branches in the babbler portion of the perching 
bird tree on denser taxonomic sampling. The "exemplar" tree represents the 14 sequences in Edwards 
et al. (1991); the "full" tree represents these plus 17 additional babbler sequences (Edwards and Wilson, 
1990). The diagonal line represents no change in number of steps along these branches in the two trees. 


including the polymorphism found in the low taxonomic level sequences, the par- 
simony reconstructions are more realistic. More importantly, what appear in the 
exemplar tree to be sites clearly delineating a particular lineage turn out to be noisier 
than this coarse sampling scheme would suggest (Fig. 9.1; Table I). 

In practice, the problem of polymorphism is usually addressed by ignoring or 
downweighting those sites or types of nucleotide change that are likely to be polymorphic 
at lower levels. Several higher level systematic analyses of avian DNA sequence 
data have dropped variability in the third positions of codons (Edwards et al., 
1991) or transition changes in all positions (Helm-Bychowski and Cracraft, 1993). 
Alternatively, step matrices, in which there is an increased cost in number of steps 
for particular substitution types (e.g., transversions), can be utilized. When such an 
approach is taken, the increase in consistency index at lower phylogenetic levels 
disappears, making these results more consistent with the findings of Sanderson 
and Donoghue (1989). Implementation of all of these methods entails an a priori 
model, as in the maximum likelihood method of phylogenetic inference (Felsen- 
stein, 1981). Maximum likelihood essentially accounts for polymorphism and sites 
known to change frequently by downweighting the contribution of these sites to 
the total likelihood. As outlined in Section IIB, these methods of dealing with 
unseen polymorphism are connected to another type of reliance on data from lower 
taxonomic levels—inferring the actual pattern of nucleotide substitution. 
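
The step-matrix idea described above can be sketched for a single site as follows. This is a minimal illustration of Sankoff-style weighted parsimony, not the procedure used in any of the cited studies; the nested-tuple tree format, the function names, and the transversion cost of 5 are assumptions made only for the example.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def step_cost(x, y, transversion_cost=1):
    """Entry of a simple step matrix: 0 for no change, 1 for a transition,
    and `transversion_cost` for a transversion (hypothetical weighting)."""
    if x == y:
        return 0
    same_class = (x in PURINES) == (y in PURINES)
    return 1 if same_class else transversion_cost


def sankoff(tree, transversion_cost=1):
    """Return {base: minimum cost of the subtree if its root has that base}.

    A tree is either a single base (a tip) or a tuple of two subtrees.
    """
    if isinstance(tree, str):
        return {b: (0 if b == tree else float("inf")) for b in BASES}
    children = [sankoff(child, transversion_cost) for child in tree]
    costs = {}
    for parent in BASES:
        costs[parent] = sum(
            min(child[b] + step_cost(parent, b, transversion_cost) for b in BASES)
            for child in children
        )
    return costs


def site_length(tree, transversion_cost=1):
    """Weighted parsimony length of one site on a fixed tree."""
    return min(sankoff(tree, transversion_cost).values())


if __name__ == "__main__":
    site = (("A", "G"), ("C", "T"))       # one site observed in four taxa
    print(site_length(site))               # equal weighting
    print(site_length(site, 5))            # transversions cost 5 steps
```

Under equal weighting the site requires three steps; when a transversion costs five steps, the one unavoidable transversion dominates the length, which is the sense in which frequently changing transitions are downweighted.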


B. Pattern of Nucleotide Substitution 


It was not until the polymerase chain reaction (PCR) was used to obtain multiple 
closely related sequences that the high transition bias originally observed in primate 
mitochondrial DNA (Brown et al., 1982) was confirmed for birds (Kocher et al., 
1989; Edwards and Wilson, 1990). This observation, which has had important con- 
sequences for the analysis of sequence data in higher level avian systematics (Helm- 
Bychowski and Cracraft, 1993; Lanyon and Hall, 1994; see also Cracraft and 
Helm-Bychowski, 1991), was determined almost wholly by observing patterns of 
nucleotide change among close relatives (sequences between which there have been 
few if any multiple changes at single sites). 

That the transition bias in avian mtDNA is best observed among close relatives 
can be illustrated again with the cytochrome b data from babblers and other passer- 
ines (Table I). Edwards and Wilson (1990) observed that the most closely related 
sequences within babbler species differed from one another solely by transition 
changes, and the skew toward C/T changes was high in most comparisons; a maxi- 
mum likelihood method yields a ratio as high as 29 (Table I). By contrast, were our 
knowledge of avian mtDNA dynamics drawn solely from comparisons of distantly 
related sequences, the bias would appear much less extreme (Table I). Parsimony 
reconstructions along these same trees confirm this trend (Table I). 

It is becoming increasingly clear that some sort of weighting of rarely occur- 
ring substitution types (e.g., transversions) can considerably improve phylogenetic 
accuracy of mtDNA analyses (sensu Mindell and Honeycutt, 1990; Cracraft and 
Helm-Bychowski, 1991; Hillis et al., 1993; Miyamoto et al., 1994). An elegant way 
of visualizing the dynamics of substitution for use in weighting schemes is simply to 
plot observed numbers of transitions on numbers of transversions for pairs of sequences 
(Hasegawa et al., 1985); here transversions serve as an approximate time 
scale along which the dynamics of transitions can be plotted, and the correlation 
between the axes of the plot is minimized.¹ Edwards and Wilson (1990) used this 
approach to estimate the transition bias in cytochrome b among close relatives; a 
crude estimate of the bias at this level (about 20:1, primarily in third positions of 
codons) was then assumed a priori to apply to the set of distantly related sequences 
later analyzed (Edwards et al., 1991). The appeal of this approach is that the model 
of nucleotide substitution assumed in the analysis of diverged sequences is based on 
real data taken from a subset of the taxa under consideration; other approaches 
either assume an arbitrary model of substitution or one based on unrelated species. 
Since the transition bias in animal mtDNA likely differs between major taxonomic 
groups, the former approach may not make use of all available information, whereas 
the latter approach may be somewhat misleading. 
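
The pairwise tabulation behind a plot of transitions on transversions can be sketched as below. This is a minimal illustration assuming aligned sequences of equal length supplied as a dictionary; the sequences, names, and function names are hypothetical and no multiple-hit correction is applied.

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2):
    """Count observed transitions and transversions between two aligned sequences.

    Positions with gaps or ambiguous bases are skipped.
    """
    ts = tv = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv


def pairwise_points(seqs):
    """Return (name1, name2, transitions per site, transversions per site)
    for every pair of sequences; these are the points of the plot."""
    points = []
    for (name1, s1), (name2, s2) in combinations(seqs.items(), 2):
        ts, tv = count_ts_tv(s1, s2)
        n_sites = min(len(s1), len(s2))
        points.append((name1, name2, ts / n_sites, tv / n_sites))
    return points


if __name__ == "__main__":
    toy = {"a": "ACGTACGT", "b": "GCGTACGC", "c": "ACTTACGT"}  # invented data
    for point in pairwise_points(toy):
        print(point)
```

Plotting the third element of each tuple against the fourth gives the kind of curve discussed in the text, with transversions per site serving as the approximate time axis.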

Sequences for multiple gene regions determined for the same set of close relatives 
can then be used to compare the dynamics of substitution among those regions. 
Figure 9.3 plots the dynamics of substitutions among 27 babbler mtDNAs for por- 
tions of 2 regions: cytochrome b and the control region (Edwards, 1992). The com- 
parison reveals that the approach to saturation for sites in region I (Fig. 9.3A) is 
much slower and less steep than that for third positions of cytochrome b (Fig. 9.3B). 
The higher rate of substitution and transition bias in third positions is also suggested 
by a maximum likelihood analysis of base substitutions (Table II; Hasegawa et al., 
1991; Edwards, 1992), although the standard errors of most of the estimates in this 
analysis are quite large owing to short DNA sequences. This initial result, however, 
is surprising, as one might expect some parts of the control region, a noncoding 
region not subject to the same constraints of coding sequences, to reflect more 
faithfully the underlying mutational bias toward transitions (Brown et al., 1982; 
Thomas and Beckenbach, 1989; Quinn and Wilson, 1993). The difference in base 
compositional bias between the two regions (Fig. 9.3C and D) suggests a similar 
pattern: the more even base composition of region I appears not to faithfully reflect 
the composition expected on the basis of the directional mutation pressure on the 
L strand for chordate mtDNA (Jermiin et al., 1995). The fact that the segment 
termed here “region I" contains at its 3’ end 45 bp of regulatory sequences (includ- 
ing the F box; Southern et al., 1988) may partly explain this pattern. On the other 
hand, one could argue that the conservative evolution of cytochrome b at the amino 


¹ Authors have applied a number of graphical methods for visualizing the magnitude of the transition 
bias, and hence the model to be employed in a phylogenetic weighting scheme for vertebrate mtDNA. 
The problem to be overcome is that the numbers of observed transitions and transversions between two 
sequences are not independent of one another, and that visualization of a change in observed bias over 
time requires an estimate of time when there is none. One common method plots the observed transition 
bias between pairs of sequences on the y axis and an estimate of corrected divergence on the x axis (for 
nonavian examples see Moritz et al., 1992). Although this method appears to fulfill the second criterion 
by using distance as a proxy for time, it actually introduces even more correlations between the axes of 
the plot, since the estimate of divergence depends on the numbers of both transitions and transversions. 
The simpler graphical method of Hasegawa et al. (1985) accomplishes the same goal with less conflation 
of the axes. 

FIGURE 9.3 Dynamics of base substitution and base composition in region I of the control region 
and cytochrome b from 27 babbler sequences (Edwards and Wilson, 1990; Edwards, 1992). (A and 
B) Plots of observed numbers of transitions (S) per site (n) and transversions (V) per site for all pairwise 
comparisons of region I and cytochrome b sequences, respectively. (C and D) Observed base compositions 
for region I and third positions of cytochrome b, respectively. 


acid level (Meyer, 1994) permits few transversions in the third positions, and that 
the mutational bias at these sites is actually much lower. The sequencing strategy 
suggested by this analysis, i.e., determining third positions instead of region I, is not 
only cumbersome but might also result in a data set with reduced signal even within 
species (Edwards, 1992). Furthermore, these comparisons could be confounded 
with differences in the extent of among-site rate variation between the two regions 
(Wakeley, 1993b). 

In theory, a single point sample taken at any point along a unique transition/ 
transversion curve (i.e., determined by comparing close or distant relatives) would 
appear sufficient to estimate the transition bias; in practice, however, curves corre- 
sponding to different biases are most distinct at small to intermediate distances, im- 
plying that comparisons at these levels will be most fruitful (Wakeley, 1993b). De- 
termining the pattern of nucleotide substitution is difficult since the trees on which 
substitutions are traced are themselves based on some model of change, although 

TABLE II Maximum Likelihood Estimates of Rates of Change (and 
Standard Errors) in Region I and Third Positions of Cytochrome b for 
20 Pairs of Sequences from Babblers (Pomatostomus)ᵃ 

                     Estimates based on two subsets of data
Parameter being      Region I                        Third positions
estimated            ------------------------------  ------------------------------
f                    1              0.41             1              0.87
AIC                  110.84         106.08           52.96          53.69
α                    0.055 (0.006)  0.364 (0.110)    2.37 (2.59)    5.08 (12.65)
β                    0.006 (0.001)  0.021 (0.004)    0.006 (0.003)  0.007 (0.003)
v_s (Myr⁻¹)          0.012 (0.001)  0.079 (0.024)    0.35 (0.38)    0.75 (1.87)
v_v (Myr⁻¹)          0.003 (0.001)  0.010 (0.002)    0.003 (0.001)  0.003 (0.002)
v_s/v_v              4              7.9              116.7          250
v                    0.015 (0.002)  0.089 (0.025)    0.35 (0.38)    0.76 (1.87)

ᵃ The maximum likelihood method of Hasegawa et al. (1990) was used to calculate 
all values. This method is based on the tree of the sequences and incorporates 
information on base composition into the model. f, Fraction of variable sites 
assumed; AIC, Akaike information criterion, a measure of the explanatory power 
of a model given the number of parameters in the model; α and β, parameters 
determining transition and transversion rates, v_s and v_v, respectively. To obtain 
absolute rates per million years a divergence time of 9 million years ago (MYA) 
for P. isidori from the other species was assumed (Sibley and Ahlquist, 1985). 


some recently proposed methods appear to yield results that are less sensitive to tree 
topology (Yang et al., 1994). Either way, the empirical estimation of patterns of 
nucleotide substitution depends critically on the range of degrees of sequence divergence 
employed, perhaps more so than on the particular method of estimation 
used (Table I). In this regard it is unfortunate there are not more data sets 
in birds consisting of sequences from protein-coding genes sampled from within 
species or between recently diverged species (Edwards and Wilson, 1990; Birt- 
Friesen et al., 1992; Moum and Johansen, 1992). Since these regions are among the 
more popular choices for use in higher level avian systematics, knowledge of their 
dynamics gleaned from analysis at lower levels can be applied in additional contexts 
(Edwards et al., 1991; Helm-Bychowski and Cracraft, 1993). Most studies at or 
below the species level to date, however, make use of gene regions that would not 
be utilized in higher level systematics (e.g., the control region; Quinn, 1992; 
Wenink et al., 1993, 1994). 

Just as in the estimation of the transition bias, the extent to which different sites 
in a sequence change at different rates is also best observed among close or inter- 
mediate relatives (Table I). As sequences become more diverged the increase in 
number of saturated sites (reflected in the increase in mean number of parsimony 
changes) obscures the variability in rates among sites. Although in practice it is 
difficult to distinguish transition bias from among-site rate variation (Wakeley, 
1993b), this again underscores the potential of lower level studies to yield informa- 
tion of critical importance to the interpretation and analysis of highly diverged 
sequences. 
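
One simple way to summarize among-site rate variation from parsimony step counts is sketched below. It is not Wakeley's (1993) test statistic; it is a moment estimator that assumes the number of changes per site follows a negative binomial distribution (Poisson changes with gamma-distributed rates), so that the variance equals the mean plus the squared mean divided by the shape parameter α. The step counts and function names are invented for illustration.

```python
from statistics import mean, variance


def dispersion_index(steps_per_site):
    """Variance-to-mean ratio of steps per site; values near 1 are what a
    single shared rate would give, larger values suggest rate variation."""
    return variance(steps_per_site) / mean(steps_per_site)


def gamma_shape_moments(steps_per_site):
    """Moment estimate of the gamma shape parameter alpha.

    Assumes negative binomial counts: variance = mean + mean**2 / alpha.
    Returns infinity when the counts show no excess variance.
    """
    m = mean(steps_per_site)
    v = variance(steps_per_site)
    if v <= m:
        return float("inf")
    return m * m / (v - m)


if __name__ == "__main__":
    steps = [0, 0, 0, 0, 1, 1, 2, 6]   # hypothetical parsimony steps at 8 sites
    print(dispersion_index(steps))
    print(gamma_shape_moments(steps))
```

Because parsimony undercounts changes at saturated sites, an estimate of this kind made on highly diverged sequences would be expected to understate rate variation, which is the point made in the text.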


III. POPULATION PROCESSES 
A. Gene and Species Trees 


Sections IIA and B dealt with the enhanced ability to detect and measure mutational 
processes from analyses at lower taxonomic levels. In these cases, phylogenetic 
analysis of data sampled at low taxonomic levels can improve the phylogenetic ac- 
curacy (sensu Hillis and Bull, 1993) of analyses performed at higher levels. Sections 
IIIA and B describe cases in which the trees inferred from DNA sequences are 
assumed to be completely accurate; here, lower level processes better inform analy- 
ses at higher levels not by improving our ability to recover the true tree, but by 
drawing attention to processes affecting the concordance between gene trees and 
trees of higher taxa. 


1. Neutrality 


Although the assumption of neutrality of DNA sequences is a prerequisite for using 
many new models appropriate for population-level processes (such as for gene flow 
and genetic drift; Slatkin and Maddison, 1989; Hudson, 1991), this assumption is 
rarely tested in more recent avian studies (but see Barrowclough et al., 1985). Neu- 
trality tests have been applied to the noncoding mitochondrial region I data from 
Pomatostomus babblers (Edwards, 1993a,b). Using Tajima’s (1989) test, selective neu- 
trality was not rejected for sequence variability observed in 11 of 12 babbler popu- 
lations (Fig. 9.4). Neutrality could also not be rejected for control region variation 
in Eurasian finches (Marshall and Baker, 1997). These results augur well for future 
studies in birds aimed at using neutral mtDNA sequences to infer population 
histories. 
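
For readers wishing to apply the test, Tajima's (1989) D statistic can be computed from the sample size, the number of variable (segregating) sites, and the mean number of pairwise differences. The sketch below follows the published formula; the function and argument names are ours, and assessing significance still requires the critical values tabulated by Tajima.

```python
import math


def tajimas_d(n, s, k):
    """Tajima's D for n sequences with s segregating sites and mean number
    of pairwise differences k."""
    if n < 3 or s < 1:
        raise ValueError("need at least 3 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimates of theta, scaled by its
    # estimated standard deviation.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Hypothetical sample: 10 sequences, 16 variable sites,
    # mean pairwise difference of about 3.89.
    print(round(tajimas_d(10, 16, 3.888889), 3))
```

A value near zero indicates that the two estimates of θ agree, as expected under neutrality and constant population size; strongly negative or positive values point to possible selection or demographic change.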


2. Incomplete Lineage Sorting 


The possibility of discordances between the trees of alleles sampled from species 
and the historical sequence of separation of those species is well known (Nei, 
1987). This discordance occurs not because gene trees have been incorrectly reconstructed, 
but because ancestral populations leave their imprint on descendent popu-

FIGURE 9.4 Tests of neutrality of mtDNA region I sequences in 12 populations of gray-crowned 
babblers throughout Australia and New Guinea. Letters denote individual populations as described in 
Edwards (1993a,b). Tajima's (1989) test compares the estimates of θ = 4Nμ provided by the number of 
variable sites (s; x axis) and the number of pairwise differences (k; y axis) for sequences sampled from a 
given population. When the estimates of θ from these two sources differ, selection is a possible interpretation. 
Thick lines indicate approximate 95% confidence limits on the joint values of s and k under 
neutrality for a sample size of 15. 


lations for an extended time after population separation. The completion of “line- 
age sorting," the genealogical term applied to simple genetic drift, comes about 
when the genetic lineages of two descendent populations trace back to ancestors 
within each population; the average time required for this process is 4N_e generations, 
where N_e is the effective population size of the gene or organelle in question 
(Neigel and Avise, 1986). When lineages have not completely sorted, as will occur 
with high probability if the time since separation is short (<<4N_e generations), there 
is a substantial probability that (1) if single alleles are chosen to represent each population, 
the tree relating them to one another will not reflect the tree of populations; 
or (2) if multiple alleles per population are sampled, the resulting allele tree 
will be scrambled, with little evidence for monophyly of lineages within populations 
(Fig. 9.5A). The problem with inferring recent historical scenarios of this sort 
is that gene flow (migration) between long-separated populations can produce gene 
trees that mimic exactly those produced by incomplete sorting (Takahata and Slat- 
kin, 1990). 

The chances of inferring the population history correctly from the gene tree 
increase as the ratio of time between population splits (t) and effective population 
size (N_e) of the gene in question increases. Moore (1995) has reminded systematists 
that the relatively small effective population size of mitochondrial DNA makes it a 
much more likely candidate for tracking the population or species tree than an 
average nuclear gene. In fact, Moore (1995) suggests that it would take 16 nuclear 
genes to provide the same level of confidence of concordance between gene and 
species trees as the single locus provided by mtDNA! This conclusion does not bode 
well for ornithologists interested in improving the picture of population history 

FIGURE 9.5 Summary and permanence of demographic effects creating discordances between gene 
and species trees. (A) Discordance between gene and species trees created by incomplete lineage sorting. 
Thick lines represent allelic lineages whose phylogeny is discordant with that of the three populations 
shown with dashes. Dots at tips represent ancestors of subsequent lineages in (B) and (C). Arrows indicate 
that these alleles are eventually fixed in their respective populations. (B) Shape of the gene tree in (A) 
soon after population splitting. Dots indicate ancestors that have given rise to new lineages. (C) Expected 
shape of discordant gene tree after much elapsed time. The tree is still discordant with the species tree 
but the short internode relative to the terminal branches makes resolution of an observed trichotomy 
difficult. (D) Effect of hybridization on gene trees. Letters represent localities or populations, shapes 
represent major allelic types. Curved arrow represents flow of an allele from population C into popula- 
tion A. (E) Observed gene tree immediately after hybridization. Arrow indicates eventual fixation of the 
new allele in (A) in that population. (F) Shape of gene tree after much time has elapsed from (D). Shaded 
triangles indicate divergence of allelic lineages from an ancestral unshaded state (D). The gene tree at this 
stage is still discordant with the species tree, even though all lineages have sorted in their respective 
populations. 


with variable nuclear loci (such as microsatellites) when mtDNA lineages appear 
not to have sorted completely; although high variability will always aid in tracing 
lineages, for any given time frame, it is small N_e for a locus, not high mutation rates, 
that helps guarantee concordance of gene and species trees. 
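
The dependence of concordance on the ratio of internode time to effective size can be illustrated with the standard three-species result, in which the gene tree matches the species tree with probability 1 − (2/3)exp(−t/N), where t is the internode length in generations and N is the effective number of gene copies in the ancestral population. The sketch below is an illustration under that model only; the numerical values, and the assumption that mtDNA has one-quarter the effective number of copies of a nuclear autosomal gene, are ours.

```python
import math


def concordance_probability(t_generations, n_gene_copies):
    """Probability that a three-species gene tree matches the species tree.

    t_generations  -- length of the internode between the two splits
    n_gene_copies  -- effective number of gene copies in the ancestral
                      population (pairwise coalescence rate 1/n_gene_copies)
    """
    p_no_coalescence = math.exp(-t_generations / n_gene_copies)
    # If the two sister lineages fail to coalesce within the internode,
    # only one of the three equally likely topologies is concordant.
    return 1.0 - (2.0 / 3.0) * p_no_coalescence


if __name__ == "__main__":
    ne = 10_000          # hypothetical effective number of breeding adults
    t = 20_000           # hypothetical internode length in generations
    nuclear = concordance_probability(t, 2 * ne)     # diploid autosomal gene
    mito = concordance_probability(t, ne / 2)        # maternally inherited
    print(round(nuclear, 3), round(mito, 3))
```

For the same internode, the locus with the smaller effective number of copies has the higher probability of tracking the species tree, regardless of its mutation rate.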

Population samples of control region sequences from the gray-crowned babblers 
provide a way to test models consistent with the observation of incomplete lineage 
sorting. Edwards (1993b) sequenced 400 bp of region I from 44 babblers sampled 
from 3 geographically close localities in the Northern Territory, Australia. Phylo- 
genetic analyses suggested that none of the populations were monophyletic with 

FIGURE 9.6  Fitch-Margoliash tree of 22 types of region I sequences from 3 populations of babblers 
(Pomatostomus temporalis) from the Northern Territory, Australia (Edwards, 1993a). Sequences sampled 
from the Melville Island, Cobourg Peninsula, and Darwin populations are represented by triangles, 
squares, and circles, respectively. The arrows indicate the first occurrence of an interpopulational coales- 
cent event in the sample (1, between lineages from Melville Island and Darwin) and the second occur- 
rence (2, between Cobourg Peninsula and Darwin). The eastern form of P. temporalis was used as an 
outgroup (see Edwards, 1993b). 


respect to the sampled mitochondrial lineages (Edwards, 1993b; Fig. 9.6). Times of 
most recent Pleistocene sea level rises were used as proxies for times of population 
splitting (about 8000–10,000 years ago) and suggested that a historical model of 
genetic drift in the absence of gene flow between the isolated populations could be 
ruled out if the long-term N_e of the populations was less than about 13,000. Because 
this value seemed large for a passerine bird, particularly a social one, it seemed unlikely 
that the lack of monophyly of lineages within populations was due to persistence 
of alleles in large populations, and statistical tests of neutrality (Tajima, 1989) 
ruled out some sort of balancing selection. Rather, migration between populations 
across water barriers (approximately 50-150 km wide) seemed a better explanation 
of the data, even though this scenario might seem unlikely for a sedentary species. 
However, Milligan et al. (1994), using a completely different approach (Kuhner 
et al., 1995), estimated from these same data that the N_e could have been much 
larger. If the gene trees are reasonably accurate, the possibility exists that the pattern 
is due to incomplete lineage sorting in isolated populations. If this is the case, we 
can still use the trees to determine the order of population splitting, even though 
they appear hopelessly scrambled and far from monophyly. Takahata (1989) showed 
that the phylogeny of closely related populations could nonetheless be extracted 
from such trees. The key observation is that populations must have split after the 
common ancestor of any allelic lineages found in those populations has split. This 
restriction requires that descendent populations have split after the most recent split 
between genetic lineages found in two different populations. 

The key event for this problem is the most recent node connecting lineages 
found in two different populations in the trees, in this case the Darwin and Melville 
Island populations (Fig. 9.6). When the probability that at least one observed inter- 
populational coalescent event occurred after the third (Cobourg) population split 
but before the Darwin and Melville split is large (as it is with the sample sizes used 
in the study), the condition specifying high consistency of the allele and organismal 
trees is met, and the temporal order of interpopulational coalescent events is the 
same as that for populations. The scenario implied by this analysis is one in which 
the island population was not the first to bud off, which might suggest a less impor- 
tant role for water barriers in the diversification of these populations. 
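
The search for the most recent interpopulational coalescent event can be sketched as a simple tree traversal. The example assumes a rooted gene tree with known node heights, written as nested tuples of the form (height, left, right) with tips labeled by population; the tree, its heights, and the function names are hypothetical and are not taken from Fig. 9.6.

```python
def tip_populations(node):
    """Set of population labels found among the tips below a node."""
    if isinstance(node, str):
        return frozenset([node])
    _, left, right = node
    return tip_populations(left) | tip_populations(right)


def interpopulation_nodes(node):
    """List (height, populations) for every internal node whose descendant
    tips come from more than one population, youngest node first."""
    if isinstance(node, str):
        return []
    height, left, right = node
    found = interpopulation_nodes(left) + interpopulation_nodes(right)
    pops = tip_populations(node)
    if len(pops) > 1:
        found.append((height, pops))
    return sorted(found, key=lambda item: item[0])


if __name__ == "__main__":
    # Hypothetical gene tree; heights are in arbitrary time units.
    tree = (5.0,
            (1.0, "Darwin", "Melville"),
            (3.0, (0.5, "Cobourg", "Cobourg"), "Darwin"))
    for height, pops in interpopulation_nodes(tree):
        print(height, sorted(pops))
```

In this invented tree the youngest interpopulational node joins Darwin and Melville lineages and the next joins Cobourg and Darwin lineages; under the argument described above, that order would be read as the order of population splitting, provided the sampling condition on consistency is met.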


3. Hybridization 


Hybridization is another important way in which population-level processes can 
influence higher level systematics (reviewed in Moore, 1995). Flow of nuclear or 
mitochondrial genes between taxa can, with inadequate sampling of populations 
and loci, prevent accurate reconstruction of evolutionary history; although the gene 
tree may be correctly inferred, the species tree will not (Fig. 9.5D). Organelle genomes 
are particularly susceptible to flow between species during hybridization 
because they will flow across taxa unconstrained by physical linkage to nuclear loci, 
although epistatic interactions with nuclear loci may impede flow (Barton and 
Jones, 1983; Harrison, 1989). Discordances between gene and species phylogenies 
via hybridization appear to be much less common in animals than in plants, where 
interspecific capture of foreign organelles (i.e., “chloroplast capture"; Reiseberg and 
Soltis, 1991) has led to major phylogenetic conflicts with other evidence. The best 
example in birds is likely Degnan's (Degnan and Moritz, 1992; Degnan, 1993) nu- 
clear and mtDNA surveys in silvereyes (Zosterops); Degnan convincingly demon- 
strates that the mitochondrial tree can misrepresent the tree of nuclear loci, warning 
against sole reliance on this molecule for systematic purposes. 


B. Permanent Effects of Incomplete Lineage 
Sorting and Hybridization 


There is a suspicion among systematists that effects such as incomplete lineage sorting 
and hybridization are temporary, and that sequences sampled at highly diverged 
phylogenetic levels will be immune to them. This is not the case. Alleles with a 
history that is discordant with that of their respective populations nonetheless have 
just as much chance of reaching fixation as concordant alleles. This scenario would 
cause the gene tree to misrepresent the species tree permanently. Such effects are 
particularly important in studies of closely related species; here the length of the 
internode comprises a substantial fraction of the length of the entire tree (Fig. 9.5B); 
under some rare conditions, this branch might actually be deemed significant by a 
bootstrap (Felsenstein, 1985) or other test. If such a scenario occurs, however, it is 
never erased. No matter how long the three populations have been diverged from 
one another, in principle, the gene tree will always misrepresent the species tree. 
However, as time increases, the length of the internode becomes trivial compared 
to those leading to the tips, and, in practice, the recovered phylogeny resembles 
a star phylogeny (Fig. 9.5C). Thus, if one wished to determine the phylogeny of a 
monophyletic trio of species that had been diverging for millions of years, contrary 
to intuition, there is a chance that the true gene tree would misrepresent the true 
species tree even in this case. There is such risk in all single gene trees of higher 
taxa (Edwards et al., 1991; Helm-Bychowski and Cracraft, 1993), but our ability to 
distinguish the true gene tree from an observed trichotomy is vanishingly small 
(DeSalle et al., 1994). Furthermore, in order for the discordant gene tree to be 
realized, sampling must take place prior to any extinction of species, as extinction 
can restore concordance. Using multiple loci to assess the higher level tree will 
nearly always improve resolution and ability to infer the species tree. 

Like lineage sorting, the imprint of hybridization on higher level phylogenetic 
trees can be long lasting but can also be erased by extinction of the species into 
which mtDNA has flowed. If foreign mtDNA somehow invades and takes over 
mtDNA “native” to a particular species, the mtDNA of that species will continue 
to yield atypical results (Fig. 9.5E and F) until hybridization with closer relatives or 
extinction of the lineage might possibly restore concordance. Several higher level 
molecular trees in birds are suspected to have been influenced by past hybridization 
events (Crow et al., 1992; Avise et al., 1990), and the lingering possibility of this 
phenomenon, even when working far above the species level, should compel work- 
ers to score nuclear loci simultaneously, either indirectly via the phenotype, or more 
directly, via nuclear markers (Moore, 1995). 


C. Dating Biogeographic Events: 
Coalescence in Subdivided Populations 


Population structure can leave its imprint on analyses above the species level 
through effects other than lineage sorting and hybridization. It has been known for 
more than 10 years that the total divergence between two species, as measured by, 
say, the average pairwise divergence between lineage tips in the two species (δ_xy), 
includes both the divergence of alleles between the species that has accumulated 
after lineage splitting and the divergence of alleles within the common ancestral 
species. A suggested measure of interspecific divergence (δ) employing a correction 
for that component of divergence caused solely by diversity within the ancestor is 

$$\delta = \delta_{xy} - 0.5(\delta_x + \delta_y) \tag{1}$$

where δ_x and δ_y are the average divergence of alleles within species x and y (Stephens 
and Nei, 1985; Wilson et al., 1985). As the equation implies, the correction 
becomes more important as diversity within the ancestor (as estimated by observed 
diversity within the two descendent species) increases. Although this correction becomes 
less important as the divergence time of species becomes large, for recently 
diverged species there is a risk of significantly overestimating the divergence time 
when species divergence (δ) is equated with gene divergence (δ_xy). 
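
The correction in Eq. (1) amounts to a one-line calculation, sketched here with invented divergence values; the function name and the numbers are assumptions for illustration only.

```python
def net_divergence(d_xy, d_x, d_y):
    """Interspecific divergence corrected for ancestral diversity, Eq. (1):
    the mean between-species divergence minus the average of the two
    within-species divergences."""
    return d_xy - 0.5 * (d_x + d_y)


if __name__ == "__main__":
    # Hypothetical values, in substitutions per site.
    d_xy, d_x, d_y = 0.020, 0.006, 0.004
    d = net_divergence(d_xy, d_x, d_y)
    print(d)                    # corrected divergence
    print(1 - d / d_xy)         # share of d_xy removed by the correction
```

With these values a quarter of the uncorrected divergence is attributed to diversity within the ancestor; the same within-species diversity would remove a far smaller share of a large interspecific divergence.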

The magnitude of this error for biogeographic studies employing a molecular 
clock of mtDNA, such as those testing models of Pleistocene speciation (Berming- 
ham et al., 1992; Zink and Slowinski, 1995), can be estimated by examining the 
range of within-species mtDNA diversity and comparing this to typical interspecific 
divergences. Moore (1995) compiled such data (primarily for North American spe- 
cies assayed via restriction enzymes) and showed that the average maximum branch 
length within species was 0.007 substitutions per site, yielding an average depth for 
intraspecific trees of 0.0035 substitutions per site, or approximately 350,000 years. 
Thus, estimates of interspecific divergence will in general be overestimates by about 
this amount without correction for ancestral diversity. A list of minimum values for 
δ (for nominate species and subspecies) for 18 genera of North American birds has 
a range of <0.001 (subspecies of red-winged blackbirds, Agelaius phoeniceus) to 0.09 
(species of sandpipers, Calidris). Although many estimates of δ_xy for North American 
birds are large, suggesting gene divergence long before the Pleistocene (e.g., >2 
million years ago), the mtDNA of a variety of avian species pairs reveals smaller δ_xy 
values (Bermingham et al., 1992; Zink and Slowinski, 1995). If any of the ancestors 
of these species pairs had coalescence times approaching 350,000 years, then the 
fraction of δ_xy attributable to ancestral diversity could be quite large. 
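
The conversion from tree depth to years, and the size of the resulting overestimate, can be sketched as follows. The text does not state the calibration used; the example assumes a rate of 0.01 substitutions per site per lineage per million years (2% pairwise divergence per million years), which reproduces the figure of about 350,000 years. The true split time of 500,000 years is hypothetical.

```python
def depth_to_years(depth, rate_per_lineage_per_myr=0.01):
    """Convert a tree depth (substitutions per site along one lineage) to
    years, under an assumed molecular clock rate."""
    return depth / rate_per_lineage_per_myr * 1e6


def ancestral_fraction(split_years, ancestral_depth_years):
    """Fraction of the uncorrected gene divergence time that predates the
    population split (i.e., is due to coalescence in the ancestor)."""
    return ancestral_depth_years / (split_years + ancestral_depth_years)


if __name__ == "__main__":
    depth = 0.007 / 2                       # half the maximum branch length
    ancestral = depth_to_years(depth)       # about 350,000 years
    print(round(ancestral))
    print(round(ancestral_fraction(500_000, ancestral), 2))
```

For a hypothetical pair of species that split 500,000 years ago, roughly two-fifths of the uncorrected gene divergence time would reflect coalescence within the ancestor rather than time since the split.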

The effect of intraspecific diversity on estimated dates of interspecific divergence 
will increase as the ancestral species becomes more structured. This is because the 
effective population size of a species structured into many semiisolated demes is 
greater than the sum of the effective sizes of those demes. Hence the within-species 
diversity of a structured species will also be disproportionately large. Nei and 
Takahata (1993), building on Wright (1943), Maruyama (1970), and Slatkin (1991), 
rederived the quantity for the effective size of a species (N_e) subdivided into n semiisolated 
demes, each of size N. It is 


$$N_e = Nn\left[1 + \frac{(n-1)^2}{4Nmn^2}\right] \tag{2}$$


where Nm is the level of gene flow (in migrants per generation) between subpopulations 
in a finite island model. When the ancestral species is structured, for example 
when Nm is low (0.001) and the number of demes high (32), the coalescence time 
is on average more than 400 times the coalescence time of a similarly sized panmictic 
species (Nei and Takahata, 1993)! This makes intuitive sense, since the demes comprising 
a highly structured species are nearly independent of one another; the low 
level of gene flow just prevents them from diverging ad infinitum. Values for δ_x in 
such a species will be similarly inflated over those values in a panmictic species of 
the same total size. Thus, possible structuring in the species from which contem- 
porary species diverged will exacerbate the failure to correct interspecific distances 
for divergence within the ancestor. 

A more intuitive way of expressing the increase in coalescence time in a structured 
species is in terms of F_st, since this is what is typically measured. We can do 
this by noting an equation for N_e of a structured species nearly identical to Eq. (2) 
for a large number of demes (n; Felsenstein, 1992): 


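\begin{align}
N_e &= Nn + \frac{n-1}{4m} \tag{3} \\
    &= Nn\left[1 + \frac{1}{4Nm}\left(\frac{n-1}{n}\right)\right] \tag{4} \\
    &\approx Nn\left(\frac{4Nm + 1}{4Nm}\right) \tag{5}
\end{align}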


When n is large, (n − 1)/n approaches 1, which gives the approximation in Eq. (5).
We can then substitute in for $F_{ST}$ by noting that $F_{ST}$ in an island model is often
expressed as 1/(1 + 4Nm), so that $1 - F_{ST} = 4Nm/(1 + 4Nm)$. This leaves us with

$$N_e \approx \frac{Nn}{1 - F_{ST}}$$

This means that the effective size of a structured species is inflated by a factor
$1/(1 - F_{ST})$ over a panmictic one of similar total size (Wright, 1943).
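As a rough numerical check on this algebra, the following Python sketch compares the finite-island expression with the large-n approximation. It assumes that Eq. (2) reads $N_e = Nn[1 + (n-1)^2/(4Nmn^2)]$ and that $F_{ST} = 1/(1 + 4Nm)$; the function names and the parameter values are illustrative and are not taken from the original analyses.

```python
def ne_island(N, n, m):
    """Effective size of a species made up of n demes, each of size N,
    exchanging migrants at rate m per generation (assumed reading of Eq. 2)."""
    Nm = N * m
    return N * n * (1.0 + (n - 1) ** 2 / (4.0 * Nm * n ** 2))


def fst_island(N, m):
    """Island-model approximation F_ST = 1 / (1 + 4Nm)."""
    return 1.0 / (1.0 + 4.0 * N * m)


def inflation_from_fst(fst):
    """Factor by which N_e exceeds the panmictic size Nn when n is large."""
    return 1.0 / (1.0 - fst)


if __name__ == "__main__":
    N, n, m = 100, 1000, 0.01  # hypothetical values: many demes, Nm = 1
    exact = ne_island(N, n, m) / (N * n)
    approx = inflation_from_fst(fst_island(N, m))
    print(f"finite-island inflation: {exact:.4f}")
    print(f"1/(1 - F_ST) inflation:  {approx:.4f}")
```

With many demes the two factors should agree closely; with few demes the $(n-1)^2/n^2$ term in Eq. (2) makes the finite-island value noticeably smaller.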

This formulation provides a new perspective on uses of a continent-wide survey
of control region sequences in the gray-crowned babbler (Edwards, 1993a; Fig. 9.7)
to date an important vicariant barrier for this and other species in Australia. The
eastern and western lineages of gray-crowned babblers diverged across a well-known
biogeographic barrier in northeastern Australia, the Carpentarian barrier.
While this barrier is considered quite old on the basis of the relative timing
of taxa diverging across it (Cracraft, 1986), there have been no estimates of its
relative or absolute age from molecular data. Average sequence divergence in region I
between the eastern and western lineages across this barrier is 8.3% (Edwards and
Kot, 1995). If the rate of divergence between the two babbler lineages in region I is
11–17% per million years as in humans (Nei, 1993; but see Mindell et al., 1996),
this suggests a divergence time of about 440,000–680,000 years ago (= genetic
distance/rate/2), without any correction for possible diversity in the ancestral
population. However, if diversity and genetic structure of the ancestor were similar to
that of contemporary lineages, then it would be high: $F_{ST}$ for region I within the
eastern and western lineages is 0.68 and 0.50, respectively (Fig. 9.7). This means
that $\delta_x$ and $\delta_y$, for the eastern and western lineages, are about three and two times,
respectively, what they would be were these lineages panmictic, and the differences
in coalescence times within each lineage appear to reflect this pattern (Fig. 9.7). The
average intraspecific divergences are 3.7 and 2.3%, respectively (Edwards and Kot,
1995), yielding a corrected between-lineage divergence of 5.3%, or only 275,000–
425,000 years. These times differ from the uncorrected value by more than 35%.
Clearly, estimating the times of recent biogeographic events such as the Carpentarian
break analyzed here requires correction for intraspecific diversity.

[Figure 9.7 tree labels: Gray-crowned, $F_{ST}$ = 0.68; Red-breasted, $F_{ST}$ = 0.50]

FIGURE 9.7 Gene genealogy of region I (control region) sequences from 12 populations of
gray-crowned babblers (P. temporalis) across Australia (Edwards, 1993a). The common names of the two major
lineages within temporalis are indicated, along with estimates of $F_{ST}$ for this region for each group. The
estimated average pairwise sequence divergence within ($\delta_x$, $\delta_y$) and between these groups ($\delta_{xy}$) is
indicated. White-browed (P. superciliosus) and Hall's (P. halli) sequences were used as outgroups (Edwards,
1992). The tree was constructed by the neighbor-joining method (Saitou and Nei, 1987) using gamma
distances with a = 0.5.
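The arithmetic behind the babbler correction can be sketched in a few lines of Python. The sketch assumes that the corrected value is the net divergence, that is, the between-lineage divergence minus the mean of the two within-lineage divergences, which is consistent with the 5.3% reported in the text. The function name and variable names are illustrative. Because a clock-based date scales linearly with distance, the proportional change in distance applies equally to the estimated time.

```python
def net_divergence(d_between, d_within_x, d_within_y):
    """Between-group divergence corrected for within-group diversity,
    assuming ancestral diversity equals the mean of the two descendants."""
    return d_between - (d_within_x + d_within_y) / 2.0


if __name__ == "__main__":
    d_xy, d_x, d_y = 8.3, 3.7, 2.3  # percent divergence, values from the text
    corrected = net_divergence(d_xy, d_x, d_y)
    reduction = (d_xy - corrected) / d_xy
    print(f"corrected divergence: {corrected:.1f}%")
    print(f"reduction relative to uncorrected: {reduction:.1%}")
    for fst in (0.68, 0.50):  # F_ST values quoted for the two lineages
        print(f"F_ST = {fst:.2f}: inflation factor {1.0 / (1.0 - fst):.2f}")
```

The last loop applies the $1/(1 - F_{ST})$ factor to the two quoted $F_{ST}$ values, which is where the "about three and two times" statement comes from.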

The logic employed in corrections for ancestral diversity is that population struc- 
ture and diversity levels have stayed roughly the same during the evolution of the 
two descendent species. This is highly unlikely for many species pairs, and may well 
be misleading. Indeed, the short distance between the base of each babbler lineage 
and their common ancestor implies that the corrected divergence time would have 
occurred after all alleles had coalesced within the gray-crowned and red-breasted 
lineages—highly implausible unless a bottleneck or some other departure from 
demographic stasis occurred (J. Felsenstein, personal communication). In addition, 
in many cases the overestimate of interspecific divergence by gene divergence might 
be swamped out by the errors associated with the molecular clock itself. The number
of intraspecific gene trees that could be useful for such purposes in birds is
growing (e.g., Zink, 1993; Zink and Dittmann, 1993a,b; Joseph and Moritz, 1994), 
and in the future, multiple population samples of sequences will considerably im- 
prove our picture of the evolution of intraspecific diversity as well as better refine 
our estimates of divergence times between species (e.g., Wood and Krajewski, 
1996). 


D. Selection 


It is sometimes stated that the use of markers that are under the influence of natural 
selection, whether morphological or molecular, is taboo for systematists interested 
in retrieving the true organismal tree. Admittedly, however, in the past this has been 
a vague rule of thumb: rarely is there any quantitative method applied to the deci- 
sion to invoke selection for a particular character (much less what type of selection), 
and it is often unknown exactly what the phylogenetic consequences of that selec- 
tion will be. These are some of the reasons why the habit of disregarding characters 
under selection a priori has been questioned. Generally, it is possible that a systema- 
tist might avoid characters influenced by positive Darwinian selection because the 
phylogenetic signal might be obscured: selection may accelerate the rate of change 
for a character, resulting in excess homoplasy, and selection may increase homoplasy 
by directly causing convergence or parallelism in unrelated lines. 

Although there are abundant examples in birds of false phylogenetic trails being 
left by morphological characters under selection (Mayr, 1963), there have been few 
similar claims for molecular characters. Nonetheless, with more diversified molecu- 
lar and statistical techniques (Tajima, 1989; Golding and Felsenstein, 1990; Gold- 
ing, 1994), there are an increasing number of examples of natural selection at the 
molecular level. In particular, polymorphisms in genes of the major histocompati- 
bility complex (MHC) of vertebrates suggest that systematists might avoid using 
characters under selection for yet another reason: balancing selection (heterozygote 
advantage) can create a situation in which the true gene tree (allelic genealogy), 
even if correctly reconstructed, often would not reflect the tree of species splitting 
(“organismal” tree). 

MHC molecules are glycoproteins that bind antigenic peptides from bacteria 
and pathogens and present these to T cells for initiation of the immune response. 
Several decades of intense molecular and comparative research have yielded a de- 
tailed picture of the causes and consequences of variability in those portions of 
MHC molecules that specifically bind foreign antigens—the antigen-binding sites 
(ABSs). MHC loci are now considered the most extreme example of natural se- 
lection at the molecular level in vertebrates (reviewed in Klein et al., 1993; Hed- 
rick, 1994). 

The particular type of selection at MHC loci, namely balancing selection (heterozygote
advantage or frequency-dependent selection), has the effect of maintaining
alleles for extremely long periods of time in populations; it makes alleles more
resistant to extinction via genetic drift. The result is not only an increase in intra- 
specific allelic diversity but a long life span of alleles and a higher incidence of alleles 
shared between long-separated species; this maintenance of ancestral polymorphism 
at MHC loci across speciation events is predicted to occur if balancing selection is 
intense enough (Klein et al., 1993). Many examples of maintained ancestral poly- 
morphism have been documented at mammalian MHC loci (e.g., Edwards et al., 
1997), with some alleles apparently having been maintained up to 40 million years 
(Klein et al., 1993). Some of this apparent ancestral polymorphism is more likely
convergence (i.e., incorrect phylogenetic reconstruction) in long-separated lineages
(Hughes et al., 1994; Takahata, 1994), and there are reasons other than selection
(recombination, intra- and interlocus gene conversion) why MHC phylogenies
often appear so scrambled (She et al., 1991; Gyllensten et al., 1991). Thus
MHC loci are excellent examples of characters that systematists interested in an
organismal phylogeny would want to avoid, both because natural selection produces
phylogenetic results that are inconsistent with other data and because the
phylogenetic signal embedded in them is obscured. They are also excellent examples of
lower level processes influencing higher level phylogenetic trees.

The cloning and comparative analysis of MHC class II genes in birds (Edwards
et al., 1995a,b) illustrates these points. One popular way of detecting selection at the
ABS of MHC loci is to find evidence for elevated rates of nonsynonymous (amino 
acid changing) substitutions (Hughes and Nei, 1988). Such rates are expected to 
increase under balancing selection because alleles with new amino acid sequences, 
and hence with new capacities for binding foreign peptides, will be advantageous 
when rare or when individuals are heterozygous. Edwards et al. (1995b) tested for 
the action of balancing selection at the ABS of MHC sequences amplified from 
three songbirds; the results suggested that, as in mammals, MHC loci in birds are 
subject to the action of balancing selection. To extend these results, phylogenies of 
three functional domains of the MHC sequences amplified from songbirds were 
built using a chicken sequence as an outgroup. At face value, the results suggest that 
ancestral polymorphisms may have been maintained for portions of the antigen- 
binding site, but that other portions of the molecule, such as the anchor exons 
(exon 3), exhibited no evidence for this effect (Fig. 9.8). Assuming the gene trees 
are correct, such a result can be explained by invoking recombination between 
functional domains of the MHC. The phylogenetic signal in the domains subject to
selection (α helix and β sheet) was strong enough to reject the topology of the
non-trans-species tree exhibited by exon 3 (Fig. 9.8). However, the divergence times of
the species in that study, which included representatives of two major songbird 
groups, namely Passerida and Corvida (Sibley and Ahlquist, 1990), were quite large, 
suggesting that possibly the phylogenies reflected more convergence in the ABS 
than maintained alleles. That the variability in the third positions of codons also
reflects the conflict between phylogenies of subdomains argues against the convergence
hypothesis; on the other hand, the extremely skewed base composition of
third positions of avian MHC class II genes makes convergence even at these positions
more likely (Edwards et al., 1995b). The uncertainty as to the gene and locus
relationships of the amplified sequences further complicates the picture. Thus,
although MHC genes are undoubtedly excellent examples of the tight link between
population level processes and higher level systematics, the precise nature of the
link, at least for birds, is still obscure.

FIGURE 9.8 Phylogenetic trees (Swofford, 1991; Saitou and Nei, 1987) of MHC class II B (β chain)
sequences amplified from red-winged blackbirds (Agelaius phoeniceus), western scrub jays (Aphelocoma
coerulescens californica), and house finches (Carpodacus mexicanus). Each sequence is designated by a species;
the number represents the individual and clone (e.g., 3.3 indicates clone 3 from individual 3). (A) Tree
of the β sheet, exon 2; (B) Tree of the α helix, exon 2; (C) Tree of exon 3. Pro 105, Insertion of a
proline codon at position 105 in the blackbird and finch sequences relative to the jay and chicken
sequences. See Edwards et al. (1995b) for details.
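To make the comparison of nonsynonymous and synonymous change concrete, here is a deliberately crude Python sketch. It only classifies codons that differ at a single position between two aligned, in-frame coding sequences, using the standard nuclear genetic code. It is not the Hughes and Nei (1988) procedure, which estimates rates per synonymous and per nonsynonymous site, and the function name and any input sequences are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_changes(seq1, seq2):
    """Return (synonymous, nonsynonymous) counts over codons that differ at
    exactly one nucleotide position; all other codons are skipped.
    Sequences must contain only A, C, G, T (no gaps or ambiguity codes)."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned, equal length, in frame")
    syn = nonsyn = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if sum(a != b for a, b in zip(c1, c2)) != 1:
            continue  # identical codons, or multiple differences
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn
```

An excess of nonsynonymous over synonymous differences in antigen-binding codons, relative to the rest of the gene, is the kind of pattern the tests cited above look for; a raw count like this one ignores the differing numbers of synonymous and nonsynonymous sites and so should be read only as an illustration.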


IV. CONCLUSION 


An eminent molecular evolutionist of birds once challenged his upper level molecu- 
lar evolution class with the question, “Can systematists working far above the spe- 
cies level effectively ignore processes operating below the species level?" As usual 
with such pop questions, the class squirmed uneasily, looking away or at each other 
or any place other than the front of the room, even though the instructor had spent
most of the lecture championing the affirmative. This chapter has attempted to 
review the evidence and marshal new perspectives championing the negative. Not 
only can higher level systematists not ignore population processes in their quest for 
the major branches of avian phylogenetic trees—they are effectively studying such 
processes at the same time. Of course, the instructor (Wilson et al., 1985), as well as 
numerous other geneticists (Avise et al., 1987; Moritz et al., 1987; Crozier, 1990),
had proffered the same perspective in their writings. For this reason I believe A. C.
Wilson's question was primarily meant to challenge and stimulate, as was so often
the case. 

It is likely that the reversed question— Can avian population geneticists effec- 
tively ignore patterns occurring far above the species level?—can be answered just 
as forcefully in the negative. For example, embedding the focal species in its appro- 
priate phylogenetic context is often the only way of determining the direction of 
evolutionary trends and of adding rigor to the writing of "evolutionary chronicles" 
(O'Hara, 1988). Molecular data, particularly those bearing on questions of selection 
and gene genealogies in structured populations, will likely continue to play an important
role in fusing these two levels.


I. INTRODUCTION


The theory and methods of phylogenetic reconstruction have improved dramati- 
cally in the last 20 years, leading to the next logical step in evolutionary biology: 
the use of phylogenies to interpret historical trends in ecology, behavior, and mor- 
phology. This new field is termed "historical ecology" (e.g., Brooks, 1985). It en- 
compasses two interrelated areas: ecomorphology, in which morphological evolution 
is interpreted in terms of ecology, and ecophylogenetics, in which ecology and be- 
havior are interpreted in light of phylogeny. Historical ecology relies on the compara- 
tive method (e.g., Lorenz, 1950; Tinbergen, 1964; Ridley, 1983; Pagel and Harvey, 
1988), which may be divided into two main approaches in this context. The first is 
the homology approach (Coddington, 1994), in which characters are optimized on (or
in) phylogenetic trees and their patterns of change are observed for insight into
evolutionary processes (summarized in Brooks and McLennan, 1991). The second, 
the convergence approach, began with statistical efforts to remove phylogenetic bias in 
comparisons of ecological characteristics among taxa (summarized in Harvey and 
Pagel, 1991), and now encompasses a wide range of methods for studying repeated 
evolutionary patterns in a phylogenetic context (e.g., Pagel, 1994). In this chapter 
we emphasize a combined approach, in which hypotheses of evolutionary processes 
(such as adaptation and phylogenetic constraint) are tested by optimization of ap- 
parently convergent characters in multiple phylogenetic settings. 

The development of historical ecology may be divided into three stages. The 
first was the conception, in principle, of the value of phylogeny in interpreting 
the evolution of ecological and behavioral characteristics. Brooks and McLennan 
(1991) provide a review of this stage. It began with Darwin and reached a particu- 
larly rich period in the 1940s and 1950s with the development of modern ethology 
(Lorenz, 1941, 1950; Tinbergen, 1953, 1964). However, emphasis on historical 
interpretation waned in the 1960s and 1970s as ecologists began to emphasize local 
processes (e.g., physical environment, resource distribution, competition, and pre- 
dation) as the determining forces in community development (e.g., Ricklefs, 1987) 
and systematists questioned the identification of behavioral homology (e.g., Atz, 
1970). Ultimately, this “eclipse of history” (Brooks and McLennan, 1991) stemmed 
from a lack of accurate phylogenies for ecological inference. 

The refinement and promulgation of cladistics and the introduction of quanti- 
tative molecular methods in the 1970s permitted movement to the second stage 
of historical ecology in the 1980s. Rigorously constructed, increasingly accurate
phylogenies became available to interpret ecological and behavioral patterns. During
this period, the introduction of microcomputer parsimony programs, namely,
MacClade (Maddison and Maddison, 1992) and PAUP (Swofford, 1993), facilitated
the process of mapping and observing character change and provided simple methods
for testing null models of phylogenetic effects (e.g., Maddison and Slatkin,
1991). The period also saw phylogenetic rationale, and consequently greater rigor,
applied to the definition or understanding of previously slippery concepts, espe- 
cially homology (e.g., Patterson, 1982, 1988), adaptation (e.g., Gould and Vrba, 
1982; Coddington, 1988; Baum and Larson, 1991), and phylogenetic constraint 
(e.g., McKitrick, 1993). Most importantly, during this period the number of em- 
pirical studies of historical ecology increased dramatically. 

The explosive growth of historical ecology has been a source of opportunity and
frustration for ornithologists. Because more ecological and behavioral data are available
for phylogenetic interpretation in birds than in any other major group of organisms,
ornithologists have been leaders in empirical ecophylogenetics and ecomorphology
(e.g., Höglund, 1989; Prum, 1990, 1994; Björklund, 1991; Lanyon,
1992; McKitrick, 1992; Richman and Price, 1992; Edwards and Naeem, 1993;
Moreno and Carrascal, 1993a,b; Winkler and Sheldon, 1993; Höglund and
Sillén-Tullberg, 1994). However, the number of avian phylogenetic studies has not kept
pace with ecological studies, and this disparity has created a temporary crisis. It has
led, in particular, to a range of standards concerning the importance of phylogenetic
accuracy to ecological interpretation, with the consequence that at least a few
ornithologists have undertaken ecophylogenetic studies using unsubstantiated or
obviously inaccurate phylogenies (e.g., McKitrick, 1992; Harvey and Nee, 1994;
Møller and Birkhead, 1994). In addition, as a consequence of initial enthusiasm,
there has been a tendency among historical ecologists to overinterpret the significance
of patterns of character change. We often hear, for example, that such-and-such
a change is "adaptive," or that a given case of stasis is the result of "phylogenetic
constraint." But the demonstration of such processes is extremely difficult. We are
at a point where the rationale, methods, and statistics of historical ecology lag far
behind our enthusiasm or empirical capabilities.

10 Phylogeny and Ecology 281

This brings us to the beginning of the third stage of historical ecology. The field 
has been through an initial period of theoretical and empirical development, and 
now it is time to reassess where it has been and where it needs to go. Such an 
assessment is difficult because it requires a consideration of all elements of historical 
ecology, from the gathering of phylogenetic and ecological data, to the detection 
of patterns in the data, to the explanation of evolutionary processes responsible for 
those patterns. Each of these elements has a large and often contradictory literature, 
and the participants in historical ecology come from widely differing backgrounds 
in systematics and ecology. Nevertheless, there are some basic themes running 
through historical ecology that permit the practical and philosophical issues to be 
sorted and assessed. Among these are the tenets that (1) no phylogenetic approach 
or statistical method is universally correct or appropriate to a problem, (2) all terms 
must be defined and understood equally by all participants in historical ecology, and 
(3) patterns and processes must be demonstrated rigorously. 

Working within these guidelines, our aim is to help set the stage for the next 
period in historical ecology. Using examples mainly from projects on which we 
have worked, we explore three fundamental issues: phylogenetic accuracy, adapta- 
tion, and phylogenetic constraint. These issues are central to historical ecology and 
enable us to cover a broad range of subjects in the field. In the end, we hope to 
identify areas in which historical ecology has made substantial advances, areas where 
there is hope of discovery, and areas where progress will be more difficult. 


II. ACCURATE PHYLOGENETIC ESTIMATES
AND HISTORICAL ECOLOGY 


The most important step to successful historical ecology is the collection of appro- 
priate, accurate, ecological and phylogenetic data. Although obvious, this assertion 
is remarkably underemphasized in ecophylogenetic studies. Emphasis is placed pri- 
marily on the quality of ecological data, which are viewed as dependent variables, 
instead of on the accuracy of phylogeny, which is generally viewed as an independent
variable without error (Lanyon, 1993). Phylogenetic estimates do have
error distributions (Lanyon, 1993; Miyamoto and Fitch, 1995), but these are commonly
ignored because they are too complicated to quantify.

When phylogenetic error is recognized in historical ecological studies, it is often 
rationalized or discounted. Some historical ecologists feel that if comparative studies 
include enough ecological data and phylogenetic estimates for a wide array of taxa,
evolutionarily informative patterns will emerge regardless of errors in specific trees.
Other historical ecologists, notably cladists, feel that the accuracy of a tree is moot 
because a tree is a hypothesis and simply the best that one can do with given data. 
Thus, all historical analyses are, in some sense, preliminary, and the accuracy of the 
hypothesis will increase with the collection and comparison of additional data. This 
is a reasonable view. However, when a preliminary phylogenetic hypothesis is 
clearly inaccurate, although rigorously constructed, it is not an acceptable premise 
to ecophylogenetic analysis. 

This disparity in views on the importance of phylogenetic accuracy is compli- 
cated by arguments among systematists who support different methods to assess 
accuracy. These arguments center around two alternative approaches: taxonomic con- 
gruence (the consensus approach), in which trees derived from distinct phylogenetic 
data sets are compared for agreement in branching patterns (e.g., Cracraft and Min- 
dell, 1989; Bledsoe and Raikow, 1990; Miyamoto and Cracraft, 1991; Sheldon and 
Bledsoe, 1993), and character congruence (the combined approach), in which all data sets 
are combined to produce a single best estimate of phylogeny based on “total evi- 
dence” (e.g., Miyamoto, 1985; Cracraft and Mindell, 1989; Kluge, 1989; Kluge 
and Wolf, 1993). 

Both approaches have appeal, but both have limitations. In taxonomic congru- 
ence analysis, if two data sets produce trees that concur, there is strong probabilistic 
support of phylogeny (e.g., Miyamoto and Fitch, 1995). However, if branching 
patterns disagree, then a decision must be made as to which data set provides the 
better estimate of phylogeny, and this decision is subjective. In character congru- 
ence, the single best estimate of phylogeny is based on evidence, as opposed to 
inference (Kluge, 1989; Kluge and Wolf, 1993). However, if one (or all) of the 
combined data sets are “positively misleading” (Felsenstein, 1978), an incorrect 
tree may be produced, and faith in such a tree will be misplaced (Bull et al., 1993; 
de Queiroz, 1993; Miyamoto and Fitch, 1995). Of course, when one (or more) of 
the alternative data sets consists of obligate distances (as produced by microcomple- 
ment fixation or DNA hybridization), the test of congruence must be taxonomic 
congruence, because distances cannot be combined with character data. Thus, taxo- 
nomic congruence is an important method in avian systematics because Sibley and 
Ahlquist (1990) compared a wide variety of groups by DNA hybridization and 
many historical ecologists employ their phylogenetic estimates (e.g., Moreno and 
Carrascal, 1993a,b; Harvey and Nee, 1994; Møller and Birkhead, 1994). 

We believe that historical ecologists must identify strong and weak parts of phylogenetic
estimates before attempting ecophylogenetic analyses. Otherwise they
will not be able to differentiate between likely and only possible scenarios of character
evolution. The assessment of branch support requires an understanding of
tree-building and testing methods and a reevaluation of all pertinent data. However, 
it does not require that ecologists collect phylogenetic data, as suggested by Kluge 
and Wolf (1993). To do so would be a highly inefficient use of time and skills in this 
age of specialization. 

For single data sets, branch robustness may be determined by approaches such as 
bootstrapping (Felsenstein, 1985), jackknifing (Lanyon, 1985), and decay analysis 
(e.g., Bremer, 1988); in addition, overall phylogenetic information content can be 
assessed via such tests as Hillis's (1991) $g_1$ test. Multiple data sets should be analyzed
both for character and taxonomic congruence. Because data are lumped in character 
congruence, tests of tree quality are the same as for a single data set. Taxonomic 
congruence may be assessed descriptively by simple comparison of trees produced 
by different data sets, or statistically using randomized data [e.g., component analysis 
(Page, 1993)] or data error distributions (e.g., Templeton, 1983; Kishino and Hase- 
gawa, 1989). In any event, if substantially different trees are produced by differ- 
ent data sets, then the single tree produced by character congruence should be 
viewed cautiously because one (or all) of the data sets has problems (Bull et al., 1993;
de Queiroz, 1993). Lanyon (1993) suggested that in this situation it is useful to 
produce a strict or majority-rule consensus tree for each data set, in which only 
strongly supported branches are resolved. Consensus trees from different data sets 
should then be compared to one another for congruence. Although the consensus 
trees from individual data sets may not be well resolved, in many cases when they 
are compared to one another, resolved parts in one tree may complement unre- 
solved parts in other trees. The result is a reliable "phylogenetic framework" for 
historical ecological study. 
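One simple way to picture this kind of comparison is to list the clades (sets of taxa) implied by each consensus tree, then keep every clade that is not contradicted by the other tree. The Python sketch below does this for rooted trees written as nested tuples. The representation, the function names, and the idea of reducing a "phylogenetic framework" to a set of mutually compatible clades are assumptions made for illustration; they are not Lanyon's (1993) published procedure.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree written as nested
    tuples of taxon names. A polytomy is a tuple with more than two members."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    all_taxa = walk(tree)
    found.discard(all_taxa)  # the root clade carries no information
    return found


def is_compatible(clade, other_clades):
    """A clade is compatible with a tree if it is nested in, contains, or is
    disjoint from every clade of that tree."""
    return all(clade.isdisjoint(o) or clade <= o or o <= clade
               for o in other_clades)


def framework(tree_a, tree_b):
    """Clades from either tree that the other tree does not contradict."""
    a, b = clades(tree_a), clades(tree_b)
    keep = {c for c in a if is_compatible(c, b)}
    keep |= {c for c in b if is_compatible(c, a)}
    return keep
```

For example, with hypothetical taxa A to E, `framework((("A", "B"), "C", "D", "E"), ("A", "B", "C", ("D", "E")))` keeps both the A+B clade resolved only in the first tree and the D+E clade resolved only in the second, which is the sense in which resolved parts of one tree can complement unresolved parts of another.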


A. An Example of the Phylogenetic 
Framework Approach 


In Fig. 10.1, we provide a simple example of how published data sets may be com- 
pared for taxonomic congruence to produce a phylogenetic framework (Lanyon, 
1993). Figure 10.1A pairs two alternative trees that estimate the phylogeny of day 
herons (Ardeidae: Ardeinae). Tree (a) is the best-fit DNA hybridization tree from
Sheldon (1987a,b); tree (b) is the most parsimonious Wagner tree based on the 
osteological study of Payne and Risley (1976). By simple congruence analysis, the 
two trees disagree in several respects. For example, they conflict in the placement 
of the cattle egret (Bubulcus ibis). From the perspective of an historical ecologist, this 
is unfortunate because the cattle egret is an interesting heron in that it is an upland 
feeder with apparent adaptations to that life style (e.g., several osteological characters
associated with its short legs, neck, and bill; Payne and Risley, 1976). To understand
the evolution of these characters requires that they be compared to those of
the cattle egret's sister taxon and closest relatives (Sheldon and Gill, 1996). The
identification of these relatives, in turn, requires the knowledge of day heron
phylogeny. However, from the available studies, we do not have a good picture of day
heron phylogeny.

[Figure 10.1 tree diagrams not reproduced. Panel A shows trees (a) and (b); panel B shows trees (c) and (d).
Taxa: Syrigma sibilatrix, Egretta thula, Egretta caerulea, Butorides striatus, Bubulcus ibis,
Casmerodius albus, Ardea herodias.]

FIGURE 10.1 (A) A taxonomic congruence assessment of day heron phylogeny featuring (a) the
best-fit DNA hybridization tree of Sheldon (1987a, Fig. 1) and (b) the most parsimonious Wagner tree of
Payne and Risley (1976, Figs. 34 and 35), based on osteological characters. (B) A congruence assessment
following reanalysis of the original data sets: (c) DNA hybridization data have been assessed by
jackknifing taxa and bootstrapping distances to produce a strict consensus tree (as described in Sheldon and
Winkler, 1993); (d) osteological data were subjected to modern analysis using global parsimony in PAUP
(Swofford, 1993), multiple (instead of composite) outgroups, and bootstrapping (Felsenstein, 1985) to
produce a majority rule tree (K. McCracken and F. H. Sheldon, unpublished analysis). Values on tree (d)
represent percentage bootstrap support of branches.

In Fig. 10.1B, we compare trees from the same data sets after having tested the 
robustness of their branching patterns. To construct the DNA hybridization tree (c), 
we jackknifed (Lanyon, 1985) and bootstrapped (Krajewski and Dickerman, 1990) 
the data, and for the morphological tree (d), we used PAUP (Swofford, 1993) and 
bootstrapping (Felsenstein, 1985). The result of the reanalysis is that the morphological
tree is much more congruent with, and helps to resolve, the DNA hybridization
tree. Conversely, the DNA hybridization tree fortifies several key branches
of the morphological tree. The only disagreement that remains is over the position 
of the whistling heron (Syrigma sibilatrix); one or both trees incorrectly place this 
species. Most importantly, we have resolved the sister relationship of the cattle egret 
and, thus, have a firmer phylogenetic framework for a comparative morphological 
study of its upland habits. 


III. APPLICATION OF THE COMPARATIVE APPROACH 
TO CLASSIC PROBLEMS OF EVOLUTION 


Given the problems inherent in phylogenetic estimation and the energy and effort 
required to gather complete ecological, behavioral, or morphological data for any 
sizable set of taxa, it would seem that the most difficult part of historical ecology is 
the initial accumulation of basic natural history data. However, as we have already 
suggested, it is equally difficult to detect meaningful patterns of character change 
from the basic data and far more difficult to demonstrate the causes of these patterns. 
To illustrate these difficulties, we discuss them in terms of the two evolutionary 
questions most frequently addressed via the comparative method: adaptation and 
phylogenetic constraint. 


A. Adaptation 


Gould and Vrba (1982), Coddington (1988, 1994), Baum and Larson (1991), and 
others have helped to develop a definition of adaptation in the context of phy- 
logeny. An adaptation is an apomorphic feature that evolved in response to an apo- 
morphic function (Coddington, 1994). It has current utility and was generated his- 
torically through the action of natural selection for its current biological role (Baum 
and Larson, 1991). By extension, convergent characters are adaptations for the same 
function in distinct lineages (i.e., not synapomorphies and not accidentally similar).

Despite (or perhaps because of) the increasing rigor in definition, it is extremely 
difficult to demonstrate adaptation through the comparative methods of historical 
ecology. Several authors suggest that hypotheses of adaptive character evolution can 
be tested directly through a phylogenetic examination of the character trait in re- 
lation to the selective environment. Coddington (1988) and Miles and Dunham 
(1993) argued that adaptation is evident when a trait change occurs at the same 
location within a phylogeny as the environmental change (Fig. 10.2a), whereas 
Baum and Larson (1991) suggested that adaptation is evident only when the envi- 
ronmental change precedes the trait change (Fig. 10.2b). However, identifying a 
correlation between environmental and trait change is only part of the process of 
testing adaptation. A causal relationship must be demonstrated between environ- 
mental and trait changes; that is, that natural selection is responsible for the estab- 
lishment and maintenance of a character. 

FIGURE 10.2 Scenarios used to test adaptation and phylogenetic constraint. (a) The origin of a novel 
trait t coincides with the origin of the novel environment e. This scenario is evidence of adaptation 
according to Coddington (1988) and Miles and Dunham (1993). (b) Trait t evolves subsequent to the 
environmental change e. This scenario is evidence of adaptation according to Baum and Larson (1991). 
(c) The origin of a novel trait t₁ coincides with the origin of another trait t₂. This is evidence that one 
character is constraining another (McKitrick, 1993) or that the development (adaptation) of one char- 
acter depends on the appearance of another, as in (a). (d) The origin of t₁ precedes the origin of t₂. This 
scenario refutes phylogenetic constraint of character t₁ by t₂ (McKitrick, 1993). 
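
The timing criteria of Fig. 10.2 can be stated explicitly in a short sketch. The Python below is illustrative only: the tree, the branch names, and the function names (ancestors, classify_timing) are hypothetical and are not taken from any of the cited studies. Each change is assigned to the branch on which it is reconstructed, and a branch is named by the node to which it leads.

```python
def ancestors(branch, parent):
    """Return the set of branches strictly ancestral to `branch`."""
    found = set()
    while branch in parent:
        branch = parent[branch]
        found.add(branch)
    return found


def classify_timing(env_branch, trait_branch, parent):
    """Classify the relative timing of an environmental and a trait change.

    'concurrent'        : both on the same branch (as in Fig. 10.2a)
    'environment_first' : environmental change on an ancestral branch (Fig. 10.2b)
    'trait_first'       : trait change on an ancestral branch
    'unrelated'         : the changes lie in different lineages
    """
    if env_branch == trait_branch:
        return "concurrent"
    if env_branch in ancestors(trait_branch, parent):
        return "environment_first"
    if trait_branch in ancestors(env_branch, parent):
        return "trait_first"
    return "unrelated"


# Hypothetical rooted tree, written as child -> parent: ((A, B), C).
parent = {"A": "n1", "B": "n1", "n1": "n2", "C": "n2", "n2": "root"}

print(classify_timing("n2", "n1", parent))  # environment change precedes trait change
print(classify_timing("n1", "n1", parent))  # both changes on the same branch
```

A classification of this kind records only the pattern. As the text emphasizes, it says nothing about whether selection caused the change.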

The difficulty in demonstrating adaptation stems from meeting all these require- 
ments. A causal relationship implies that the environmental change preceded the 
trait change. Thus, reliable paleoecological data must be available to characterize 
the environment, and a reliable phylogeny must be available to demonstrate timing. 
The phylogeny must have at least two lineages appearing after the environmental 
change, one with the putative adaptation and one without (e.g., Fig. 10.2b). Other- 
wise, it is impossible to determine unambiguously by parsimony whether the en- 
vironmental change occurred before or concurrently with the trait change. Not 
only must data be available on paleoenvironmental conditions and timing, but char- 
acter and environmental changes in intervening time must meet certain assump- 
tions. It must be shown that the adaptive trait was more beneficial than alternative 
character states that never spread or that disappeared through extinction (Dobson, 
1985). In addition, the trait must have been favored because it conferred its current 
adaptive function, i.e., is not an exaptation (Gould and Vrba, 1982). This implies 
that the selective environment has not changed substantially between the time when 
the trait arose and the present. Some biologists feel that this assumption is unlikely 
to hold in most cases (Frumhoff and Reeve, 1994). Others feel that it is reasonable. 
Pagel (1994, p. 38), for example, noted it is a peculiar view of evolution to suppose 
". . . that traits are labile and evolve for various functions until the organism chances 
upon using the trait for its current function, at which time there is no further modi- 
fication of the trait." 

Given the difficulty of determining past environmental conditions and organis- 
mal interactions, most inferences of adaptation quite reasonably rely on correlations 
of character changes in a phylogenetic context (e.g., Donoghue, 1989; Maddison, 
1990). A change in morphology, behavior, or ecology may be related (1) to a change 
in physical environment or (2) to a previous change in morphology, behavior, or 
ecology, which changes the functional environment. In principle, adaptation is 
demonstrated when it can be shown repeatedly that a particular character appears 
following the development of a particular environment. Similarly, if it can be shown 
repeatedly that a particular trait appears following the development of another trait, 
then selection of the second trait has been shown to be favored by the appearance 
of the first trait (e.g., Donoghue, 1989; Prum, 1990). Unfortunately, most examples 
of character "correlation" consist of a single historical association between a trait 
and an environmental change, or between two trait changes. A single coincident 
event provides little evidence of a correlative relationship, let alone a causal rela- 
tionship. In contrast, cases in which character transformations occur in a predicted 
sequence in multiple independent lineages provide more substantial evidence of 
causal relationships (Frumhoff and Reeve, 1994; Leroi et al., 1994; Pagel, 1994). In 
such circumstances, statistical tests based on null models of chance occurrence (e.g., 
Ridley, 1983; Maddison, 1990) assume real power. Thus, to investigate hypotheses 
concerning the evolutionary sequence of traits, it is necessary to focus on traits that 
have multiple origins (i.e., convergent or homoplastic characters). Although such 
characters are a nuisance to phylogeneticists, they hold a wealth of information 
about evolutionary processes. 
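
The contrast between a single association and repeated, independent associations can be illustrated with a deliberately simple null model. The sketch below assumes that, in each independent lineage, the predicted order of changes would arise by chance with probability 0.5. That probability and the function name binomial_tail are illustrative assumptions; this is not the test of Ridley (1983) or of Maddison (1990).

```python
from math import comb


def binomial_tail(successes, trials, p=0.5):
    """Probability of at least `successes` lineages showing the predicted
    order out of `trials` independent lineages, if each does so by chance
    with probability `p` (an assumed null value)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# A single coincident event is uninformative under this null.
print(binomial_tail(1, 1))

# The same order observed in six of six independent lineages.
print(binomial_tail(6, 6))
```

Under these assumptions a single association has probability 0.5 by chance, whereas six of six has probability 1/64, or about 0.016. That difference is why characters with multiple origins are so informative.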

Although repeated independent correlations among traits provide strong evi- 
dence for adaptation, selection is only one of several evolutionary mechanisms that 
can produce such patterns (Frumhoff and Reeve, 1994; Leroi et al., 1994). They can 
also occur as the result of interactive genetic processes such as pleiotropy or genetic 
linkage (see descriptions of how these genetic systems cause false correlations in 
Section III,B). Most studies offering adaptive explanations do so in the absence of 
direct evidence about selection or the genetic interaction among traits. Direct evi- 
dence of evolutionary mechanisms can be obtained only if (1) natural selection is 
assessed experimentally and (2) the phenotypic and genetic covariance structure 
among traits is measured. While such detailed investigations may provide secure 
conclusions about adaptation, they set a high standard and require a long-term com- 
mitment of energy and money. 

The comparative method can begin the process of distinguishing between cor- 
relations caused by natural selection, drift, and genetic interactions, if the nature of 
these forces is considered carefully. Drift is the easiest factor to discount in correla- 
tion analyses because it is not expected to cause consistent convergent changes in 
multiple lineages. Selection and genetic interactions are more difficult to distin- 
guish, but the two may be teased apart by considering fundamental differences in 
how they produce environment/trait and trait/trait correlations. Selection moves 
relatively slowly; the environment changes first and subsequently the trait changes, 
or one trait changes and then subsequently the other changes. When genetic inter- 
actions are at play in trait/trait correlations, the effect is instantaneous because one 
gene acts directly on another (e.g., by pleiotropy). Thus, if trait/trait correlations 
appear simultaneously (i.e., on the same branch), genetic interactions are possible. 
If trait/trait correlations appear sequentially, e.g., following cladogenetic events, 
selection is a more likely explanation. A more difficult situation concerns the dis- 
tinction between selection and genetic interactions when the correlations are of the 
environment/trait variety. In such cases, genetic interactions can be viewed as a 
chain reaction, in which the environment affects an unrecognized trait that, in turn, 
affects the recognized trait. Thus, the timing of genetic interactions in environ- 
ment/trait cases is likely to be sequential and may be indistinguishable from the 
pattern produced by selection. Even so, it may still be possible to differentiate be- 
tween the two forces using the logic of Simpson (1944). Simpson noted that if the 
dependent (second) trait is highly consistent among clades, genetic effects are sug- 
gested because selection is a more haphazard process and would be expected to 
produce substantial variation. 

Prum (1990, 1994) discovered and outlined a convincing example of a causal 
relationship between two traits: multiple origins of elaborate plumage ornaments 
and display behaviors in the manakins (Pipridae). Prum predicted that if the derived 
display behaviors were distributed more generally among taxa than the plumage 
novelty, then the behavior evolved prior to the plumage, and the hypothesis that 
plumage has evolved as a consequence of the display would be corroborated. If the 
plumage were more generally distributed, then the opposite hypothesis would be 
supported. In manakins, derived male plumage traits have evolved subsequently to 
the behavioral novelties in which they are prominently featured. This implies that 
behavioral diversification is driving some aspects of morphological diversification 
within the family. 


Adaptations and Key Innovations: Parus Example 


In their review of the subject, Heard and Hauser (1995, p. 152) defined a key in- 
novation as “. . . an evolutionary change in individual trait(s) that is causally linked 
to an increased diversification rate in the resulting clade (for which it is a synapo- 
morphy).” A key innovation, therefore, would be a special case of adaptation: an 
environmental condition exists, a taxon acquires a trait that is selectively advanta- 
geous in that environment, and radiation ensues. The difference between an adap- 
tation and a key innovation is radiation; the key innovation must create circum- 
stances in which diversification increases in those lineages having the trait relative 
to those that lack the trait. The number of species would be expected to increase if 
the fitness conferred by the trait increased the longevity or range of individual spe- 
cies, thereby creating opportunities for speciation by vicariance or dispersal, or di- 
minishing the likelihood of extinction (Heard and Hauser, 1995). 

FIGURE 10.3 The phylogeny of Parus estimated by Sheldon et al. (1992) and Slikas et al. (1996). 
Subgenera that cache seeds are marked with asterisks. 

An example of a putative adaptation that may also be a key innovation is seed 
caching in the genus Parus, chickadees and titmice (Sheldon and Gill, 1996). This 
hypothesis is based on the observation that Parus is divided into two lineages 
(Fig. 10.3). One of these (blue and great tits) consists of seven species (Eck, 1988), 
none of which is known to cache seeds. The other lineage consists of 23 species 
(Eck, 1988), all of which apparently cache seeds (e.g., Ekman, 1989). Thus, seed 
caching is a synapomorphy for the speciose clade, and it seems to be largely respon- 
sible for the radiation of parid species in that clade. By ensuring a predictable supply 
of food in winter, seed caching would permit extensive exploitation of coniferous 
and deciduous forests. 

FIGURE 10.4 Scenario required to support a hypothesis of adaptation, as in the case of seed caching 
by Parus. e is an environmental change hypothesized to be responsible for the selection of t. There must 
be multiple independent instances in which a trait arises in a given environment and results in radiation. 
Moreover, lineages in the same environment that lack the trait must be relatively species poor, as may 
lineages with the trait arising in different environments. 
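
One way to see why a single contrast of 23 caching species against 7 noncaching species cannot settle the question is to compare it with an equal-rates null model. Under such a model, every division of N species between two sister clades is equally likely. The sketch below is an illustration under that assumption, not an analysis from the source. The function names are hypothetical, and the combination of several contrasts uses Fisher's method with made-up placeholder values for the additional clades.

```python
from math import exp, log


def sister_clade_p(n_with_trait, n_without):
    """One-tailed probability, under an equal-rates null, that the clade
    bearing the trait holds at least `n_with_trait` of the species in the
    two sister clades combined (each clade has at least one species)."""
    total = n_with_trait + n_without
    return (total - n_with_trait) / (total - 1)


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method. The statistic
    -2 * sum(ln p) is chi-square with 2k degrees of freedom, whose upper
    tail has a closed form for even degrees of freedom."""
    half_x = -sum(log(p) for p in p_values)
    term, total = 1.0, 0.0
    for i in range(len(p_values)):
        total += term
        term *= half_x / (i + 1)
    return exp(-half_x) * total


# Parus: 23 caching species versus 7 noncaching species.
p_parus = sister_clade_p(23, 7)
print(round(p_parus, 3))

# Placeholder p-values for two further independent contrasts (hypothetical).
print(fisher_combined_p([p_parus, 0.10, 0.08]))
```

For Parus alone the probability is 7/29, or about 0.24, so the asymmetry is unremarkable under this null. Only replicated contrasts in other caching lineages from the same environment could give the comparison statistical force.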

The test for this hypothesis is the same as any comparative test of adaptation, 
except that the trait in question must be associated with a relative increase in the 
number of species. Therefore, we need to examine caching and noncaching lineages 
in other groups that live in the same environment (Fig. 10.4). The environmental 
criterion is essential because the function for which the caching develops must be 
the same in all taxa. If it were not, we would not be examining the same phenome- 
non, and the hypothesis would likely be falsified. For example, if we referred to 
bowerbirds, some of which are known to cache (Pruett-Jones and Pruett-Jones, 
1985), we might not find that caching is correlated with diversification. But bow- 
erbirds live in tropical and subtropical forest, where the supposed advantages of 
caching to parids (i.e., winter food) do not apply. Similarly, we probably would not 
include cachers such as woodpeckers or squirrels in our comparisons, even though 
they may live in the same environment, because they are phylogenetically remote 
and many elements of their biology are bound to differ (Pagel, 1994). Instead, we 
should look to other temperate coniferous and deciduous forest passerines that 
cache: namely nuthatches (Sittidae) and crows, jays, and nutcrackers (Corvidae). 
These are both oscine groups, but neither is sister taxon to the Paridae (Sibley and 
Ahlquist, 1990; Sheldon and Gill, 1996). Thus, they are relatively close genetically, 
but have acquired seed caching independently of the Paridae (i.e., seed caching is 
not a synapomorphy). Moreover, they share a similar range of nuclear DNA diver- 
gence with the titmice (ca. 1–4%; Sibley and Ahlquist, 1990), suggesting a tempo- 
rally coincident radiation (given an approximate molecular clock) that is possibly 
driven by the same environmental forces as that of the Paridae. 

Unfortunately, the phylogenies of sittids and pertinent corvids have not been 
well studied, and although knowledge of caching in certain taxa is extensive (e.g., 
nutcrackers; Balda and Kamil, 1989), information on the patterns and specifics of 
caching for many taxa is unknown. Moreover, the influence of habitat on the evo- 
lution of caching in all three families is also poorly known. Thus, considerably more 
work is required to test the hypothesis. However, the elements of a truly rigorous 
analysis of seed caching as an adaptation are in place. Similar possibilities for the 
rigorous study of adaptations and key innovations exist throughout the Passerifor- 
mes, as this order is rife with examples of multiple convergent evolution (e.g., to 
seed eating, nectivory, trunk probing, leaf gleaning, and flycatching) and subsequent 
radiation (e.g., Bledsoe, 1988; Sibley and Ahlquist, 1990; Sheldon and Gill, 1996). 


B. Phylogenetic Constraints 


In contrast to adaptive change, one may find that a certain trait occurs throughout 
a monophyletic group and varies little among members of that group. This char- 
acter uniformity may be attributed to phylogenetic constraint, which has been defined 
as "any result or component of the phylogenetic history of a lineage that prevents 
an anticipated course of evolution in that lineage" (McKitrick, 1993). That is, phy- 
logenetically constrained traits are expected, a priori, to vary in response to vari- 
able selection, but they resist adaptive modification as a result of inherited genetic 
conditions. 

The difficulty that phylogenetic constraint presents to historical ecologists is 
that phenotypic variation may be limited for several reasons, some of which may 
be considered phylogenetic constraint and some of which cannot (Edwards and 
Naeem, 1993; Frumhoff and Reeve, 1994; Leroi et al., 1994). Phenotypic variation 
may be limited because of (1) a lack of genetic variation, (2) pleiotropy, in which a 
single gene underlies the expression of several traits and the evolution of one of 
those traits can be constrained by selection on others, (3) gene linkage, in which a 
change in one gene may be restricted by its proximity to other genes on the same 
chromosome, and (4) stabilizing selection, in which a trait is maintained by selec- 
tion against alternative phenotypes. 

The first three limitations may be considered phylogenetic constraints; the lack 
of variation is caused by inherited genetic conditions whose momentum or com- 
plexity make evolutionary change unlikely or difficult to induce (phylogenetic iner- 
tia). In contrast, stabilizing selection maintains stasis on the basis of current eco- 
logical interactions and not common ancestry; hence we refer to it as an ecological 
constraint. We emphasize one further distinction. Although a trait may be correlated 
strongly with phylogeny (e.g., Winkler and Sheldon, 1993), such a correlation is 
not a demonstration of phylogenetic constraint. Instead, it is a phylogenetic effect (e.g., 
Miles and Dunham, 1993) because (1) there is no a priori expectation of variation 
or directionality and (2) the cause may be phylogenetic constraint, ecological con- 
straint, or a combination of the two. 

Some authors have observed that all species of gulls in the genus Larus lay three 
eggs (Graves et al., 1984; McLennan et al., 1988), even though there may be selec- 
tive disadvantages to doing so (e.g., insufficient or overabundant food to feed three 
nestlings). Similarly, all species in the family Megapodidae incubate their eggs in a 
variety of habitats using heat sources other than body temperature (Ligon, 1993). 
On the basis of consistency in the face of expected variation, these authors conclude 
that these traits are phylogenetically constrained within these groups. The idea that 
character evolution is constrained by phylogeny, however, is only a hypothesis 
about evolutionary processes based on a pattern. To establish constraint, the hy- 
pothesis must be tested. 

Although Frumhoff and Reeve (1994) suggested that it is not possible to assess 
whether a character’s presence in extant taxa results from phylogenetic constraint 
on adaptive evolution, other workers have felt that a hypothesis of phylogenetic 
constraint is testable. McKitrick (1993) proposed an examination of the relative 
timing of the historical sequence of character evolution to test the hypothesis of 
phylogenetic constraint (e.g., Fig. 10.2). McKitrick maintained that a hypothesis 
of phylogenetic constraint is supported when two interrelated, mutually constrain- 
ing traits arise at the same point within the phylogeny (Fig. 10.2c). Alternatively, 
McKitrick argued that a hypothesis of constraint is falsified when the evolution of 
the supposed constrained trait precedes the evolution of the trait thought to con- 
strain it (Fig. 10.2d). Interestingly, these analyses of trait-appearance patterns are 
analogous to the tests that Coddington (1988) and Baum and Larson (1991) pro- 
moted to support hypotheses of adaptation (Fig. 10.2a and b, respectively). 

Such an approach to testing phylogenetic constraint is extremely difficult, if 
not impossible. The same test is used to demonstrate adaptation (an ecological 
phenomenon) and constraint (a phylogenetic phenomenon). This dilemma dem- 
onstrates the hierarchical interrelationship between adaptation and constraint. As 
Ligon (1993) pointed out, adaptation at one level is constraint at another. Indeed, 
the identification of a “key” adaptation, i.e., one that underlies a radiation, rests on 
the phylogenetic conservativeness of that trait at a lower hierarchical level (see the 
swallow example in the next section). 

Because of these problems, hypotheses of phylogenetic constraint may be ex- 
amined more effectively if restricted variation in a trait is associated with fluctua- 
tions in the selective environment (Edwards and Naeem, 1993; Ligon, 1993; Miles 
and Dunham, 1993). Edwards and Naeem (1993) proposed that cooperative breed- 
ing in Australian birds persists even though the various species occupy different 
ecological regimes. They note that the lack of a relationship between cooperative 
breeding and habitat refutes the hypothesis that cooperative breeding necessarily 
reflects responses to current environmental or ecological conditions. The rejection 
of this hypothesis is an important advance in understanding the evolution of coop- 
erative breeding. Of course, once an adaptive scenario is disproved, another may be 
erected in its place for testing. For example, in this instance, habitat may not be the 
primary selective force in the evolution of cooperative breeding; perhaps the force 
is some other selective agent, such as the availability of mates (e.g., Pruett-Jones and 
Lewis, 1990). However, a careful consideration of the natural history of the study 
group should make it possible to limit the number of ad hoc scenarios that require 
testing. 

The first step in identifying phylogenetic constraint is, thus, to show that the 
lack of phenotypic variation is more likely to be the result of ancestral genetic effects 
than selection. Eventually, we may be able to determine the genetic basis of particu- 
lar traits (through breeding experiments or molecular analyses) and map those 
genetic attributes onto phylogenies. The strength of the hypothesis then must be 
assessed by documenting significant selection differences among species. If phylo- 
genetic constraint is the likely explanation for stasis, the analytical focus should then 
shift to identifying the genetic mechanism of constraint. 


An Avian Example of Issues in Phylogenetic Constraint 


In Fig. 10.5, we present an example of difficulties encountered when arguing for 
phylogenetic constraint. Figure 10.5 presents an estimate of swallow intergeneric 
phylogeny (Sheldon and Winkler, 1993). When nest structure is mapped onto the 
tree and randomized data sets are compared (via the randomization function in 
MacClade; Maddison and Maddison, 1992), a strong phylogenetic effect for nest 
type is indicated (Winkler and Sheldon, 1993). For example, all members of one 
speciose clade (Hirundo, sensu lato) use mud to build their nests. Members of its sis- 
ter group either dig their nests in sandy soil or adopt holes or niches in trees and 
cliffs. Despite this consistency, substantial variation exists within the major nest- 
construction themes. The mud nesters, for example, construct a variety of nest 
types, from simple cups (e.g., barn swallows, Hirundo rustica) to enclosed globes with 
entrance tunnels [e.g., cliff swallows (Petrochelidon pyrrhonota)]. The Australian tree 
martin (Petrochelidon nigricans) adopts a hole in a tree, but lines or dams it with mud. 

FIGURE 10.5 A DNA hybridization estimate of the phylogeny of swallow genera presented as a 50% 
majority rule tree (Sheldon and Winkler, 1993; Winkler and Sheldon, 1993). New World endemic 
genera are marked with asterisks. Numbers indicate bootstrap branch support. Reprinted with permis- 
sion of the publisher. 
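
The logic of the randomization comparison can be sketched in a few lines. The code below is not MacClade's implementation. It is a minimal illustration that counts character-state changes by Fitch parsimony on a rooted binary tree, then asks how often a random reassignment of the same states to the tips requires as few changes as the observed assignment. The eight-taxon tree, the tip labels, and the nest-type states are hypothetical.

```python
import random


def fitch_steps(tree, states):
    """Minimum number of state changes on a rooted binary tree given as
    nested 2-tuples of tip names (Fitch parsimony)."""
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):          # a tip: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        steps += 1                         # no shared state: one change needed
        return left | right

    visit(tree)
    return steps


def tips(tree):
    """List the tip names of the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in tips(child)]


def randomization_p(tree, states, n_shuffles=999, seed=0):
    """Proportion of shuffled data sets needing no more changes than the
    observed data, counting the observed data set itself."""
    rng = random.Random(seed)
    names = tips(tree)
    observed = fitch_steps(tree, states)
    values = [states[name] for name in names]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(values)
        if fitch_steps(tree, dict(zip(names, values))) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_shuffles + 1)


# Hypothetical tree of eight genera in two sister clades.
tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
states = {t: "mud" for t in "abcd"} | {t: "hole" for t in "efgh"}
print(randomization_p(tree, states))
```

A small proportion indicates a phylogenetic effect in the sense used here. It does not indicate whether that effect arises from phylogenetic constraint, ecological constraint, or both.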

Are the swallows in these clades phylogenetically constrained, or does selection 
determine the structure of nests in the major clades? Most mud nesters live in sub- 
Saharan Africa, where tree holes are at a premium and mud is freely available. Most 
hole adopters live in the New World tropics, where mud nests may be less adaptive. 
Emlen (1954), for example, noted that mud nests of swallows can crumble in con- 
ditions of high humidity, even without being directly moistened. If habitat controls 
nest type, the basic nesting strategy (mud building, adoption, burrowing) may be 
considered a key adaptation (Winkler and Sheldon, 1993). If genetics controls nest 
type, then nesting strategy is constrained. 

The problem is that variations on the three themes could be simple modifica- 
tions within the confines of stabilizing selection, phylogenetic constraint, or a com- 
bination of the two (Winkler and Sheldon, 1994). Moreover, the outwardly simple 
pattern is confused by exceptions. The purple martin (Progne subis), for example, is 
undoubtedly a member of the large core martin clade (Fig. 10.5). It is a well-known 
hole adopter and a New World endemic (like all hole adopters in the core martin 
clade). Even so, purple martins are known occasionally to build open nests with 
mud walls (F. H. Sheldon and L. A. Whittingham, personal observation). Does this 
mean that all swallows have the genetic capacity to build mud nests, and it is only 
globally manifested in one major clade? Perhaps hole nesting is the phylogenetic 
constraint, and burrowing, adopting a hole, and building a mud nest with an en- 
trance tunnel are variations on that theme. The point is that any number of stories 
that invoke phylogenetic constraint can be formulated, and the demonstration of 
constraint in this instance will require a careful study of nest types in variable 
environments. 


IV. SUMMARY 


Historical ecology is burgeoning because it provides structure to the study of eco- 
logical patterns and evolutionary processes. However, in early efforts to apply his- 
torical methods, some ecologists and systematists have neglected the importance of 
accurate phylogenies to ecophylogenetic analysis, and others have been too hasty to 
invoke adaptation and phylogenetic constraint to explain patterns of character de- 
velopment. We have emphasized the use of accurate phylogenetic data because the 
interpretation of evolutionary patterns obviously changes as relationships among 
taxa change. We have also emphasized the difficulty in demonstrating adaptation 
and constraint in the hope that more rigor will be applied to the study of these 
phenomena. In doing so, we have outlined approaches that investigators might take 
to examine trait evolution and interaction. Although progress will be limited with- 
out a knowledge of the quantitative genetics of specific traits, we think that initial 
hypothesis testing is possible with prudent use of the phylogenetic approach, pro- 
vided that multiple examples of potential evolutionary phenomena are examined. 
For example, a hypothesis of adaptation is strengthened substantially if it can be 
shown that the putative adaptation has evolved convergently in several distinct line- 
ages under highly similar environmental conditions (e.g., Pagel, 1994). For phylo- 
genetic constraint, alternative explanations for stasis should be ruled out by testing 
for trait constancy in a variety of selective regimes. 

FIGURE 10.6 (caption fragment) . . . in taxa A and B. However, the pattern of trait (t) covariation with 
environmental change (e) suggests that . . . 

An important benefit of the historical approach is that it may be used to generate 
hypotheses and focus future research. For example, we can identify traits that occur 
multiple times within a phylogeny, and thus may provide productive grist for quan- 
titative genetic studies. We can also predict the characteristics of unstudied taxa. 
Fig. 10.6 is a hypothetical example showing that trait t appears to be related to an 
environmental condition e. A and B may be unstudied taxa where the state of t is 
unknown, but their environment is known (e₁). We can predict that A and B will 
have trait t₁ if this trait evolves in response to e₁. Phylogenies can also be used to 
test evolutionary models, such as sexual selection models of female mate choice. For 
example, the sensory bias hypothesis predicts that female preferences for exagger- 
ated male secondary sexual traits evolved before the male trait (e.g., Basolo, 1990; 
Hill, 1994). This hypothesis can be tested by observing in the phylogeny the se- 
quence in which the male trait and the female preference evolved. In short, the 
predictive strength and usefulness of the historical approach are limited only by the 
availability of accurate ecological and phylogenetic data. 


I. INTRODUCTION 


Patterns of geographic variation in morphological characteristics of organisms re- 
veal a variety of evolutionary processes (James, 1970). External phenotypic char- 
acters of American robins (Turdus migratorius) tend to covary with geographic tem- 
perature–humidity gradients, illustrating the potential of local adaptation to effect 
geographic variation patterns (Aldrich and James, 1991; James, 1970). The diffi- 
culty of establishing the genetic basis of polygenic fitness traits (e.g., James, 1983), 
and therefore the evolutionary interpretation of spatial patterns, has led investigators 
to document the geography of genetic variation by various indirect methods. These 
methods include protein electrophoresis, restriction fragment length polymorph- 
isms (RFLPs) in mitochondrial DNA (mtDNA), and most recently the direct se- 
quencing of DNA itself. Such methods are of interest because the way in which 
genetic variation is apportioned within and among populations reveals signatures 
left by various evolutionary processes. 

In this chapter I review ways in which molecular techniques have been applied 
to the study of geographic variation in birds. I concentrate on studies of mtDNA 
variation because this provides the largest database for inference. After a brief review 
of allozymic studies, which set the stage for subsequent work, I evaluate how studies 
of mtDNA have contributed to understanding the evolution of geographic varia- 
tion, population structure, and gene flow. A relatively new field, comparative phy- 
logeography, is presented as a way of testing causal mechanisms of geographic dif- 
ferentiation as well as the historical stability of community composition. 


II. ALLOZYMIC STUDIES OF AVIAN 
GEOGRAPHIC VARIATION 


Several authors (e.g., Barrowclough, 1983; Corbin, 1987; Evans, 1987; Barrow- 
clough and Johnson, 1988) reviewed allozymic studies of bird species, and they 
summarized resultant estimates of genetic variability. Although there were reports 
of natural selection influencing gene loci surveyed with electrophoresis (Redfield, 
1974; Gyllensten et al., 1979), most studies used allozymic variants as neutral ge- 
netic markers for estimating levels of genetic variability, and for investigating 
demographic structures and processes (Barrowclough et al., 1985). Data on hetero- 
zygosity suggested that avian populations generally had levels of genetic variation 
consistent with those observed in other vertebrates. Another genetic estimate of 
interest was the fraction of genetic variation distributed among populations. Some 
studies found significant allelic frequency differences (e.g., Johnson and Marten, 
1988), but in some of these cases comparisons might have been interspecific. In 
general the amount of genetic variation distributed among populations of temper- 
ate North American birds was low compared to many other vertebrates (Barrow- 
clough, 1983; Avise, 1983). In fact, levels of among-population differentiation 
seemed anomalously low. For example, Barrowclough (1980a) described patterns 
of allelic variation in the yellow-rumped warbler (Dendroica coronata), and despite 
considerable geographic distance and morphological variation among population 
samples once ascribed to different biological species, no diagnostic or frequency 
differences were observed. Barrowclough concluded that population sizes are large, 
and connected by either ongoing or recently ceased gene flow. This view of popu- 
lation structure was the typical one inferred for North American birds from allo- 
zyme studies. The situation in the tropics might be different (Capparella, 1991). 
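
The two quantities discussed in this section, heterozygosity and the fraction of genetic variation distributed among populations, can be made concrete with a small sketch. The text does not specify which estimator the reviewed studies used. The code below assumes a simple ratio of the form (H_T - H_S) / H_T, computed from allele frequencies with equal weight for each population and no correction for sample size. The function names and allele frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared
    allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def among_population_fraction(populations):
    """Fraction of total gene diversity that lies among populations,
    (H_T - H_S) / H_T, for one locus. `populations` is a list of
    allele-frequency lists in the same allele order, weighted equally
    (an assumption)."""
    n = len(populations)
    h_s = sum(expected_heterozygosity(p) for p in populations) / n
    mean_freqs = [sum(column) / n for column in zip(*populations)]
    h_t = expected_heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Two hypothetical populations with modest frequency differences.
print(among_population_fraction([[0.6, 0.4], [0.4, 0.6]]))

# Fixed alternative alleles: all variation lies among populations.
print(among_population_fraction([[1.0, 0.0], [0.0, 1.0]]))
```

Values near zero correspond to the weakly structured populations inferred from allozymes for temperate North American birds. Values near one would indicate populations fixed for different alleles.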

Two hypotheses were offered to explain the general lack of population structure 
in birds, but were not resolved with allozymic data: avian populations have high 
effective population sizes and levels of gene flow (Barrowclough, 1980b), or avian 
molecular evolution proceeds at a slow rate relative to molecular evolution in other 
vertebrates, and to plumage evolution in the birds themselves. Avise (1983) sug- 
gested that a molecular rate slowdown might be due to high avian body tempera- 
tures that exert a stringent selective environment. A variant of the molecular slow- 
down explanation exists, which suggests that allozymic evolution proceeds at a 
slower pace than other avian genomic regions (Zink, 1991). For whatever reason(s), 
a problem in interpreting allozyme evidence is that currently segregating allozyme 
alleles probably have common allele ancestors that predated the fragmentation of 
populations and in some cases species (Zink and Remsen, 1986), and are therefore 
poor in information about recent population history. Mindell et al. (1996) discuss 
evidence for a rate slowdown at the DNA level, and suggest that relatively high 
avian body temperature might account for reduced differentiation (Kessler and 
Avise, 1985) via lower rates of change. 

The notion that avian populations were largely unstructured prompted James 
(1991) to suggest that strong selection at the morphological level must overcome 
high levels of gene flow. Needed was an independent set of molecular markers to 
test the nature of avian population structure and gene flow deduced from allozyme 
evidence. The DNA revolution provided a new set of markers that has captured the 
attention of researchers even more than allozyme electrophoresis (Lewontin, 1974). 


III. THE DNA REVOLUTION 
IN INTRASPECIFIC STUDIES 


Surveys of restriction fragment length polymorphisms in mtDNA rapidly sup- 
planted allozymic studies as a source of markers for studying both population-level 
and higher level systematic questions (Avise, 1994). The maternally inherited, non- 
recombining, rapidly evolving mtDNA genome is rich in information about popu- 
lation-level processes. An advantage of mtDNA RFLP (and sequence) data is that 
the variant individual patterns (haplotypes) can be analyzed phylogenetically. In this 
approach, one infers the phylogenetic history of haplotypes (effectively alleles at the 
haploid mtDNA “locus”) in the same way that one infers, for example, the phylogeny
of species in a genus. Haplotype phylogenies are superimposed over geography,
an approach aptly termed phylogeography by Avise et al. (1987). At one extreme,
all haplotypes at all localities would trace to single common ancestors,
themselves geographically arrayed in the phylogenetic tree, which would signal a 
highly substructured population and would offer a hypothesis for the history of 
isolation events. The haplotypes themselves could be similar or different in percentage
sequence divergence, which would correspond to weak and strong phylogeographic
divisions, respectively. The “depth” of the structured haplotype trees could be an index
to the age of the population fragmentation recorded by the mtDNA.

FIGURE 11.1 Geographic pattern of haplotype relationships in the common grackle. (From Zink
et al., 1991a.) Reprinted with permission of the publisher.

At the other
extreme, haplotypes would appear geographically “scrambled,” which would sug- 
gest either recent expansion of a species' range or high levels of current gene flow. 
Recent range expansions can result in haplotype phylogenies that are not geographi- 
cally structured, a process termed lineage sorting (Avise, 1994) or retained ancestral 
polymorphism. Geographically unstructured haplotypes could be similar or divergent, 
each of which would yield different interpretations. For instance, similar haplotypes 
not showing a geographic pattern might suggest a recent bottleneck followed by 
extensive gene flow. Alternatively, some (e.g., Rand et al., 1994) have proposed
“selective sweeps” where one mtDNA haplotype is rapidly substituted because it is
highly favored. Divergent haplotypes that are geographically unstructured might 
indicate previous allopatric divergence followed by a breakdown of barriers and 
population admixture, such as in snow geese (Chen caerulescens; Avise et al., 1990). 
These are four more or less extreme phylogeographic structures with a large variety 
of intermediate possibilities (Avise et al., 1987). Each reveals information about the
history of mtDNA lineages, which likely reflect the history of populations, even
though the mtDNA genome is but a single “gene” embedded in the organismal
phylogeny (Avise, 1994).

[Figure 11.2 tip labels, given as haplotype number and state(s) of occurrence: 1 (NY, N/S-C, FL);
2 (FL); 3 (NY); 4 (N/S-C); 5 (NY); 11 (FL); 6 (LA, FL); 9 (FL); 7 (LA). Clade labels: Atlantic Coast,
Gulf Coast. Divergence between the two clades: 1%.]

FIGURE 11.2 Phylogenetic relationships among haplotypes in the seaside sparrow (from Avise and
Nelson, 1989), indicating a lack of phylogeographic structure within the two major clades corresponding
to Atlantic and Gulf coasts. Abbreviations are for states in the U.S., and “N/S-C” refers to either North
or South Carolina, which could not be deduced from Avise and Nelson’s (1989) paper. Degree of haplotype
divergence not indicated except for the distance between the two principal clades (1%).


IV. PHYLOGEOGRAPHIC STUDIES IN BIRDS 
A. Nature of Variation 


The first major study of avian mtDNA phylogeography (Ball et al., 1988) showed 
that a phenotypically variable species, the red-winged blackbird (Agelaius phoeni- 
ceus), was essentially unstructured geographically. Another species that lacked phy- 
logeographic structure was the common grackle (Quiscalus quiscula; Fig. 11.1), 
which was once considered two species. In contrast, Avise and Nelson (1989) found
that the seaside sparrow (Ammodramus maritimus) consisted of two
discrete parapatric geographic units (Fig. 11.2).

I compiled studies of mtDNA differentiation (Table I) in a variety of North 
American birds to determine general correlates of differentiation. I ascertained 
whether populations sampled on either side of mountains, deserts, Beringia, those 
on islands, or those separated by long distances without barriers, tend to be differ- 
entiated (i.e., haplotypes at a locality form a clade); some species figured multiple 
times in the tests if they were relevant to more than one type of barrier. The data 
set was limited by the idiosyncratic nature of individual studies, and by the difficulty 
in defining barriers for vagile organisms such as birds. 
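
The criterion used above (haplotypes at a locality form a clade) can be illustrated with a minimal sketch. The tree, the haplotype names, and the locality labels below are hypothetical and are not taken from any study in Table I; the tree is assumed to be rooted and is written as nested tuples.

```python
# Minimal sketch: does each locality's set of haplotypes form a clade
# in a rooted haplotype tree? All data here are hypothetical.

def tips(node):
    """Return the set of tip (haplotype) names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def all_clades(node):
    """Return the tip sets of every subtree, including the whole tree."""
    clades = {tips(node)}
    if not isinstance(node, str):
        for child in node:
            clades |= all_clades(child)
    return clades


def localities_forming_clades(tree, locality_of):
    """Map each locality to True if its haplotypes make up exactly one clade."""
    clades = all_clades(tree)
    by_locality = {}
    for haplotype, locality in locality_of.items():
        by_locality.setdefault(locality, set()).add(haplotype)
    return {loc: frozenset(haps) in clades for loc, haps in by_locality.items()}


# Hypothetical rooted tree: ((h1, h2), h3) is sister to (h4, h5).
tree = ((("h1", "h2"), "h3"), ("h4", "h5"))

structured = {"h1": "east", "h2": "east", "h3": "east", "h4": "west", "h5": "west"}
scrambled = {"h1": "east", "h2": "west", "h3": "east", "h4": "west", "h5": "east"}

print(localities_forming_clades(tree, structured))
print(localities_forming_clades(tree, scrambled))
```

In the first assignment each locality corresponds to one clade, the pattern scored here as differentiated; in the second the haplotypes are geographically scrambled and neither locality forms a clade.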


TABLE I Mitochondrial DNA Differentiation in North American Birdsᵃ


Species 


Pipilo erythrophthalmus 
Geothlypis trichas 
Melospiza melodia 
Molothrus ater 
Zenaida macroura 
Picoides pubescens 
Agelaius phoeniceus 
Dendroica petechia 
Calidris alpina 
Agelaius phoeniceus 
Chen caerulescens 
Ammodramus caudacutus 
Spizella passerina 
Quiscalus quiscula 
Parus bicolor 
Parus wollweberi 
Ammodramus maritimus 

Gulf coast 

Atlantic coast 
Colaptes auratus 
Parus carolinensis 
Parus hudsonicus 
Parus atricapillus 
Tympanuchus sp. 
Branta canadensis 
Passerella iliacaᶜ
Passerella iliaca iliaca 
Passerella iliaca megarhyncha 
Passerella iliaca unalaschcensis 
Passerella iliaca schistacea 
Branta bernicla 


Geothlypis trichas 
Passerella iliaca 
Picoides pubescens 
Amphispiza belli 
Agelaius phoeniceus 
Melospiza melodia 
Zenaida macroura 
Parus inornatus 
Spizella passerina 
Agelaius phoeniceus 
Molothrus ater 


Parus atricapillus 


Phalacrocorax pelagicus


Ref.
Barrier


Distance (E/W)
Distance (E/ W) 
Distance 
Distance 
Distance 
Distance 
Distance 
Distance (E/ W) 
Distance (E/ W) 
Distance (N/S) 
Distance (E/ W) 
Distance (E/ W) 
Distance 
Distance 
Distance 
Distance 
Distance 
Distance 
Distance 
Distance 
Distance (E/ W) 
Distance 
Distance 
Distance 
Distance (N/S) 
Distance 
Distance 
Distance 
Distance 
Distance 
Distance 


Cascades, S. Nevada 
Cascades, S. Nevada 


Cascades, Rockies 
Sierra Nevada 
Cascades, Rockies 
Cascades, Rockies 
Cascades, Rockies 
Sierra Nevada 


Cascades, S. Nevada 
Cascades, S. Nevada 


Cascades, Rockies 
Sierra Nevada 
Cascades, Rockies 


Beringia 


Phylogeographic 
variation 


Method 


RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
Seq 

RFLP 
Seq 

RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 


Seq 

RFLP 
RFLP 
Seq 

RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 
RFLP 


RFLP 





(Continues ) 


TABLE I (Continued)


                                                          Phylogeographic
Species                           Ref.ᵇ  Barrier         variation        Method

Anas crecca                        6     Beringia        N                RFLP
Gallinago gallinago                6     Beringia        W                RFLP
Numenius phaeopus                  6     Beringia        S                RFLP
Larus canus                        6     Beringia        S                RFLP
Sterna hirundo                     6     Beringia        W                RFLP
Brachyramphus marmoratus           6     Beringia        S                RFLP
Picoides tridactylus               6     Beringia        S                RFLP
Hirundo rustica                    6     Beringia        W                RFLP
Pica pica                          6     Beringia        S                RFLP
Anthus spinoletta                  6     Beringia        S                RFLP
Calcarius lapponicus               6     Beringia        N                RFLP
Leucosticte arctoa                 6     Beringia        S                RFLP
Calidris alpina                    4     Beringia        S                Seq
Arenaria interpres                 4     Beringia        N                Seq

Branta bernicla                   27     Island (N)      S                RFLP
Dendroica petechia                 9     Island (S)      S                RFLP
Coereba flaveola                  10     Island (S)      S                RFLP
Melospiza melodia                 11     Island (N)      N                RFLP
Passerella iliaca                  7     Island (N)      W                RFLP
Saltator albicollis               16     Island (S)      S                RFLP
Geothlypis trichas                 2     Island (N)      N                Seq
Parus hudsonicus                  24     Island (N)      W                RFLP
Parus atricapillus                24     Island (N)      W                RFLP

Parus gambeli                     24     Desert          S/S              RFLP
Dendroica nigrescens              17     Desert          S/S              RFLP
Strix occidentalis                18     Desert          S/S              Seq
Auriparus flaviceps               19     Desert          N                Seq, RFLP
Campylorhynchus brunneicapillus   19     Desert          N                Seq, RFLP
Polioptila melanura               19     Desert          N                Seq, RFLP
Toxostoma lecontei                19     Desert          S/S              Seq, RFLP
Toxostoma curvirostre             19     Desert          S/S              Seq, RFLP
Pipilo fuscus                     19     Desert          S/S              Seq, RFLP
Passerculus sandwichensis         20     Desert (Baja)   W/S?             RFLP


ᵃAbbreviations for level of phylogeographic variation: S, strongly differentiated; W, weak; N, none
apparent. If S or W is indicated, the nature of variation may be coded as S (step clinal) or G (gradual).

ᵇReferences: (1) Ball and Avise (1992); (2) J. T. Klicka and R. M. Zink (unpublished data); (3) Ball
et al. (1988); (4) Wenink et al. (1993); (5) Johnson and Cicero (1991); (6) Zink et al. (1995); (7) Zink
(1994); (8) Quinn (1992); (9) Klein and Brown (1995); (10) Seutin et al. (1994); (11) Zink and Dittmann
(1993a); (12) Rising and Avise (1993); (13) Gill and Slikas (1992); (14) Zink and Dittmann (1993b);
(15) Zink et al. (1991a); (16) Seutin et al. (1993); (17) Bermingham et al. (1992); (18) Barrowclough et al.
(personal communication); (19) R. M. Zink and R. C. Blackwell (unpublished data); (20) Zink et al.
(1991b); (21) Avise and Nelson (1989); (22) Fleischer et al. (1991); (23) Moore et al. (1991); (24) Gill
et al. (1993); (25) Ellsworth et al. (1994); (26) Van Wagner and Baker (1990); (27) Shields (1990).

ᶜConsidering all four taxa conspecific.


1. Effects of Distance


Isolation by distance results in differences between populations at opposite ends of 
a more or less continuous group of populations if dispersal distances are relatively 
low (Wright, 1978). It is difficult to test for this effect because "distance" is often 
confounded by barriers, either current or ancient, that could be associated with 
phylogeographic divisions in modern-day species. I identified studies in which 
samples were taken from at least 500 km apart, and for which the species’ habitat is 
more or less continuous without obvious current barriers to gene flow or major 
range disjunctions. Where phylogeographic differences were observed, I deter- 
mined whether they occurred over very short distances (e.g., step clines) or whether 
the variation was gradual over the region compared (i.e., consistent with isolation 
by distance). 

Most phylogeographic differences occurred over a geographically limited area 
(Table I). For example, in the fox sparrow (Passerella iliaca), there are four distinct 
groups of haplotypes that apparently are parapatrically distributed, and each group 
is relatively uniform intra-se (Zink, 1994). Overall data suggest that dispersal dis- 
tances are too large, or population expansions too recent, for isolation by distance 
to be a major factor in structuring avian populations in North America, and that 
where differences do exist, there appears to be evidence of a barrier other than 
distance. 


2. Effects of Mountains 


Populations distributed on either side of mountain barriers could be expected to 
diverge because gene flow is limited, or the mountain ranges mark boundaries be- 
tween different areas of endemism. My compilation suggests that mountain barriers 
could be a significant (6 of 13 species in Table I) cause of mtDNA differentiation. It 
is difficult to falsify the proposition that mountain ranges represent sites of second- 
ary contact. 


3. Effects of Deserts 


Deserts can act as barriers to nondesert species, and intervening habitats can isolate 
taxa in different deserts (Hubbard, 1973). Species distributed in different deserts of 
southwestern North America show a variety of levels and patterns of differentiation. 
Canyon towhee (Pipilo fuscus) and curve-billed thrasher (Toxostoma curvirostre) show 
considerable mtDNA RFLP and sequence differentiation across the Sonoran and 
Chihuahuan deserts, whereas several other species [verdin (Auriparus flaviceps), cactus
wren (Campylorhynchus brunneicapillus), black-tailed gnatcatcher (Polioptila melanura)]
appear undifferentiated. However, in southern Baja California Sur, preliminary
data suggest that the verdin and cactus wren are significantly differentiated
(R. M. Zink, unpublished data), a pattern found in other vertebrates (Murphy, 
1983). A subspecies of Le Conte’s thrasher (T. lecontei arenicola) isolated along the
west coast of central Baja California is strongly differentiated (Zink et al., 1997)
whereas populations of California gnatcatcher (Polioptila californica) are at most 
weakly differentiated throughout Baja California (R. M. Zink, G. F. Barrowclough, 
R. C. Blackwell, and J. L. Atwood, unpublished data). Thus, species sampled in the 
aridlands of North America exhibit a mixture of phylogeographic patterns. 


4. Effects of Islands 


Although the island of Newfoundland shows evidence of having populations ge- 
netically differentiated from adjoining continental ones, other island populations 
did not show differentiation. For example, neither fox nor song sparrows were dif- 
ferentiated in restriction sites on the Queen Charlotte Islands or Vancouver Island, 
relative to adjacent mainland populations. This suggests that these large islands re- 
cently were separated from the mainland, were colonized relatively recently, or re- 
ceive immigrants at a rate preventing differentiation. Too few populations have 
been sampled to make firm conclusions about islands as isolating factors. 


5. Effects of Beringia 


Zink et al. (1995) compared small samples of species found on either side of Berin- 
gia. Of 13 species they studied, all but 2 showed evidence of mtDNA RFLP dif- 
ferentiation, with 4 species showing weak and 7 species strong differentiation. 
However, the relationship between morphological and mtDNA differentiation was 
inconsistent. Populations of three-toed woodpecker (Picoides tridactylus) exhibit 
little morphological difference between the two continents, yet they were very dif- 
ferent in mtDNA, suggestive of species status. Other species, such as the marbled 
murrelet (Brachyramphus marmoratus), show both mtDNA and morphological dif- 
ferentiation. Overall, most species showed mtDNA differentiation to some degree, 
consistent with geographic isolation on different continents. 


6. Summary 


Contrary to allozyme studies, almost 40% of species examined exhibited geographic
variation in mtDNA (Table I). Most occurrences of significant phylogeographic 
structure have been found in species that also exhibit morphological differentiation; 
often mtDNA and subspecies boundaries are congruent, whereas this was not the 
case with allozymic studies. The resultant question is whether there are consistent 
geographic correlates of structured haplotype trees. Although past barriers are dif- 
ficult to judge from current conditions, most mtDNA phylogeographic structure 
seems associated with a barrier other than distance, although most general barriers 
appear associated with phylogeographic structure. However, the existence of an 
apparent barrier or a named subspecies is not a general predictor of mtDNA differentiation.
That is, more subspecies lack mtDNA differentiation than exhibit it (Ball
and Avise, 1992). Phylogenetic analysis of mtDNA haplotypes seems better able to 
document patterns of population differentiation than allozyme data. In summary, a 
new perspective on avian populations is emerging, recognizing more structure than 
suggested by allozyme analysis. 


B. Haplotype Phylogenies and Directionality 


If a haplotype phylogeny were rooted with an outgroup haplotype (e.g., Maddison 
et al., 1984), or from coalescence theory (Crandall and Templeton, 1993), one 
might infer the historical direction of colonization by superimposing the haplotype 
tree on a map, starting from the basal haplotype and moving up the tree, revealing 
a directional geographic progression (Avise et al., 1983; Lansman et al., 1983). How- 
ever, if a species has been evolving for a significant period, “dispersing” haplotypes 
will reach geographic range boundaries and be “deflected” backward, erasing the
monotonic relationship between geographic and genetic distance. Equilibrium is 
often a signal of the latter phenomenon (Neigel and Avise, 1993). 

Most phylogeographic studies did not include a sister species, resulting in an 
(unrooted) network. A rooted tree for the common grackle (Fig. 11.1) did not 
exhibit a directionality, potentially owing to high levels of gene flow after the range 
was fully occupied (Moore and Dolbeer, 1989). A rooted haplotype tree for the 
song sparrow (Zink and Dittmann, 1993a) suggested that basal haplotypes occurred
in Newfoundland, and the more derived haplotypes were found in the northern 
part of the range. This is consistent with a relatively recent northward spread of 
song sparrows from an eastern refuge following retreat of glaciers. Several caveats 
render this only a hypothesis (Zink and Dittmann, 1993a). However, studies of
other species point to basal haplotypes also occurring in Newfoundland (Zink, 
1994; Gill et al., 1993). Further research is required to determine whether New- 
foundland, or nearby sites today submerged (see Pielou, 1991), were in fact a refuge 
for other species. 


C. Estimates of Gene Flow 


An expectation for molecular markers is that they will provide indirect measures 
of gene flow. Slatkin and Maddison (1989) suggest that coalescent events between 
haplotypes that are currently in different populations are evidence of recent gene 
flow. In the common grackle, sister haplotypes often occurred in different popula- 
tion samples (Fig. 11.1) suggesting gene flow. In general, RFLP data suggest consid- 
erable gene flow. However, in many RFLP data sets, Slatkin and Maddison’s (1989) 
method seems inappropriate as individual haplotypes are often found in multiple 
populations, which could be evidence of gene flow or simply ancestral retentions
and nonequilibrium conditions (see Edwards, 1993). Direct sequencing might resolve
more haplotypes, which could be useful for gene flow calculations. Edwards
(1993) sequenced individual babblers, and the haplotype phylogeny provided evi- 
dence of long-distance gene flow among populations. 

Neigel et al. (1991) and Neigel and Avise (1993) use information on haplotype 
divergence, generation time, rate of molecular evolution, and the geographic oc- 
currence of haplotypes to infer single generation dispersal distances. Importantly, 
their method does not require genetic or demographic equilibrium, conditions un- 
likely to hold for North American birds [and an assumption required for Slatkin 
and Maddison's (1989) method]. Dispersal distances for birds tend to be higher 
(73.0 km/generation) than those for rodents (<0.5 km/generation) (Neigel and
Avise, 1993; Zink, 1996a), consistent with interpretations derived from allozyme
and RFLP data (Zink and Remsen, 1986). Estimates of dispersal distance differ from 
the typical measure of gene flow derived from allozymic data (or any frequency- 
based method), Nm, the average number of immigrants exchanged among demes 
per generation. Nm values are not easily compared to the single-generation dispersal 
distance. Because of the restrictive assumptions associated with calculations of Nm 
from DNA data, single-generation dispersal distances might be a better measure 
to compute and compare among avian populations. Such data also permit direct 
comparison with mark-recapture studies. Unfortunately, there are relatively few 
mtDNA dispersal distances estimated at this time. Nevertheless, mtDNA data do 
suggest that gene flow is high enough to prevent differentiation over distance. 


D. Description and Significance of Genetic 
Variation among Populations 


An important step in the evolutionary process is the conversion of genetic variation 
from within to among populations. With allozyme electrophoresis, measures such
as F_ST and G_ST were computed, which estimate that proportion of genetic variance
distributed among populations (Wright, 1978). The degree of population structure
is taken to reflect degree of geographic isolation. Considering haplotypes as alleles
at a locus (the mtDNA genome), one can compute F_ST and G_ST for mtDNA data.
The analysis that seems best suited for this is N_ST (Lynch and Crease, 1990); unfortunately,
available computer programs require raw data that is often not included in
published phylogeographic studies. 

TABLE II G_ST Values for Avian Speciesᵃ


No clear phylogeographic structure             Clear phylogeographic structure
Species                 Ref.ᵇ  G_ST value      Species                   Ref.ᵇ  G_ST value

Song sparrow             1     0.09            Fox sparrow                6     0.26*
Chipping sparrow         2     0               Seaside sparrow            7     0.38*
Red-winged blackbird     3     0.11            Rufous-sided towhee        5     0.45*
Common grackle           4     0.06            Common yellowthroat        5     0.22
Downy woodpecker         5     0               Black-capped chickadee     8     0.19
Brown-headed cowbird     5     0.27            Boreal chickadee           8     0.57*
                                               Carolina chickadee         8     0.25*
                                               Sharp-tailed sparrow       9     0.27*
Mean:                          0.09            Mean:                            0.32


ᵃ“No clear phylogeographic structure” means that haplotypes at a locality (or within a region) are not
generally each other's nearest relatives, whereas the alternative suggests the opposite. An asterisk indicates
that the value was significant at the p = 0.05 level.

ᵇReferences: (1) Zink and Dittmann (1993a); (2) Zink and Dittmann (1993b); (3) Ball et al. (1988);
(4) Zink et al. (1991a); (5) Ball and Avise (1992); (6) Zink (1994); (7) Avise and Nelson (1989); (8) Gill
et al. (1993); (9) Rising and Avise (1993).


G_ST or F_ST values calculated from haplotype data without corrections for small
sample size or sequence divergence between haplotypes, or consideration of phylogeography,
can be misleading. For a distribution of haplotype frequencies among
populations, the single possible G_ST/F_ST value is associated with multiple haplotype
trees (see Felsenstein, 1978; for example, with only 9 haplotypes, there are 2,027,025
possible rooted trees). Thus, if a G_ST value is significant, but the haplotype tree is
unstructured, the investigator needs to consider which “message” to believe. I suggest
that the haplotype tree be accorded primary significance if the two methods
suggest different pictures of population history.
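
The figure of 2,027,025 rooted trees for 9 haplotypes can be reproduced with a short sketch. It assumes that the count refers to rooted, strictly bifurcating trees with labeled tips, for which the number of trees on n tips is the product of the odd integers up to 2n - 3; the function name is illustrative only.

```python
# Minimal sketch, assuming rooted bifurcating trees with labeled tips:
# the number of such trees on n tips is 1 * 3 * 5 * ... * (2n - 3).

def rooted_bifurcating_trees(n_tips):
    """Count rooted, bifurcating, tip-labeled trees for n_tips >= 2."""
    if n_tips < 2:
        raise ValueError("need at least two tips")
    count = 1
    for odd in range(3, 2 * n_tips - 2, 2):  # 3, 5, ..., 2n - 3
        count *= odd
    return count


for n in (3, 4, 9):
    print(n, rooted_bifurcating_trees(n))
```

Because a single table of haplotype frequencies is compatible with every one of these topologies, a frequency-based statistic alone cannot identify which tree underlies the data.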

I calculated G_ST as (H_T - H_S)/H_T, where H_T is the total gene diversity and H_S is
the weighted average of within-population gene diversity; no correction was made
for sequence divergence. H_T and H_S are calculated with a correction for sample size.
The significance of G_ST is assessed by comparing the observed value with the values
obtained by randomly reallocating haplotypes among populations (keeping the original
sample sizes) and recalculating G_ST. The P value is obtained by dividing the number of
times that the recalculated G_ST is equal to or larger than the observed one by the total
number of permutations. A P value of less than 0.05 indicates significant genetic structure.
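
The calculation and randomization test described above can be sketched as follows. This is a minimal illustration and not the program used for Table II. Two details are assumptions, because the text does not specify them: the sample-size correction is taken to be the factor n/(n - 1) applied to one minus the sum of squared haplotype frequencies, and the within-population average is weighted by sample size. The haplotype samples are hypothetical.

```python
import random
from collections import Counter


def gene_diversity(haplotypes):
    """Gene diversity with an n/(n - 1) sample-size correction (assumed form)."""
    n = len(haplotypes)
    if n < 2:
        return 0.0
    sum_sq = sum((count / n) ** 2 for count in Counter(haplotypes).values())
    return n / (n - 1) * (1.0 - sum_sq)


def g_st(populations):
    """G_ST = (H_T - H_S) / H_T; H_S is weighted by sample size (assumption)."""
    pooled = [h for pop in populations for h in pop]
    h_t = gene_diversity(pooled)
    if h_t == 0.0:
        return 0.0
    h_s = sum(len(pop) * gene_diversity(pop) for pop in populations) / len(pooled)
    return (h_t - h_s) / h_t


def permutation_test(populations, n_perm=1000, seed=0):
    """Reallocate haplotypes among populations, keeping the sample sizes."""
    rng = random.Random(seed)
    observed = g_st(populations)
    pooled = [h for pop in populations for h in pop]
    sizes = [len(pop) for pop in populations]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if g_st(shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical samples: two populations fixed for different haplotypes,
# and two populations with identical haplotype frequencies.
fixed = [["A"] * 10, ["B"] * 10]
mixed = [["A", "B"] * 5, ["A", "B"] * 5]

print(permutation_test(fixed))
print(g_st(mixed))
```

With the sample-size correction, populations that share the same haplotype frequencies can yield a slightly negative G_ST, which is usually read as zero.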

FIGURE 11.3 Distribution of G_ST values from mtDNA data and F_ST values from allozyme data. The
single F_ST value greater than 0.6 was based on a single locus (Barrowclough and Gutiérrez, 1990). Data
used in computation of F_ST distribution are available from the author.

I found 14 avian phylogeographic studies in North America from which G_ST
could be computed (Table II), and I divided these studies into those with and without
phylogeographic structure. For the former, the average G_ST was 0.32 ± 0.13
(SD; n = 8) and for the latter it was 0.09 ± 0.11 (SD; n = 6). Clearly, there can be
biases introduced by overly rapid mutation rates (nonequilibrium), and differences
in effective population size between organellar (e.g., mitochondrial) and nuclear
genes, but I feel that comparisons of the distribution of G_ST and F_ST values derived
from mtDNA and allozyme data are of interest. Although there is variation caused
by differing sample sizes, population samples, and geographic area covered, the
mtDNA data suggest a broader range of population differentiation relative to that
for allozyme studies (Fig. 11.3). Thirty-three of 38 (87%) allozyme F_ST values are less
than 0.10, whereas the comparable figure for mtDNA surveys was 4 of 14 (29%).
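
The group means and the mtDNA fraction quoted above can be checked against the values listed in Table II. The sketch below uses only those tabulated values; the allozyme values behind the 33-of-38 figure are not given in the text, so they are not reproduced, and the variable names are illustrative.

```python
# G_ST values transcribed from Table II.
no_structure = [0.09, 0.0, 0.11, 0.06, 0.0, 0.27]
structure = [0.26, 0.38, 0.45, 0.22, 0.19, 0.57, 0.25, 0.27]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


all_values = no_structure + structure
below = sum(1 for value in all_values if value < 0.10)

print(round(mean(no_structure), 2), round(mean(structure), 2))
print(below, len(all_values), round(100 * below / len(all_values)))
```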

One might ask, therefore, what is to be made of the many allozymic studies of 
avian population structure in North America (Fig. 11.3). Clearly, those reporting 
no structure need to be corroborated by other molecular markers, because it seems 
likely that allozyme variation might not record recent population fragmentation 
events (Zink, 1991). Conclusions about high rates of gene flow, natural selection, 
or large effective population sizes also need to be interpreted cautiously. Degree of 
population structuring might indicate speciation potential, with a highly structured 
species being closer to the speciation boundary (Templeton, 1980). If true, allozyme 
data suggest relatively low speciation potential in birds, relative to groups such as 
rodents and salamanders. However, there are more species of birds in the world than 
rodents or salamanders, which further casts doubt on the validity of population 
structure inferences from allozyme data. Speciation in birds might be rapid owing 
to sexual selection (Zink, 1996b).


V. COMPARATIVE PHYLOGEOGRAPHY
A. Method 


Haplotype phylogenies have a variety of uses, such as in conservation biology and 
mating system studies (Avise, 1994). Zink (1996a) reviewed a potential parallel be- 
tween historical (vicariance) biogeography (Cracraft, 1982; Wiley, 1988) and phy- 
logeography. In vicariance biogeography one examines the phylogenetic patterns 
among species in lineages distributed over the same areas of endemism. The most 
parsimonious explanation for congruent patterns is that the component lineages 
were historically widespread and codistributed (or “broadly sympatric”), and that 
they responded to the same set of vicariance events. Lack of congruence can result 
from dispersal across barriers, differential response to barriers, or lack of long-term 
sympatry in ancestral biotas (Zink and Hackett, 1988). On a more recent time scale, 
one can ask whether phylogeographic patterns among currently codistributed spe- 
cies are congruent, an endeavor that could be termed comparative phylogeography. If 
so, it would suggest that species’ phylogeographic patterns were shaped by com- 
mon responses to unique historical events. This interpretation assumes that the 
variation is selectively neutral, and that gene flow, mutation, and genetic drift pro- 
duce phylogeographic patterns. Congruent phylogeographies suggest that the spe- 
cies composition of communities has been reasonably stable. If species turnover 
in communities is relatively frequent, congruent phylogeographic patterns would 
not be pervasive because, presumably, species would not be broadly sympatric for 
sufficient periods of time. Some paleoecologists predict that "communities have 
broken up and reformed in different configurations repeatedly and regularly on 
time scales of a few thousand years" (Bennett, 1990), suggesting that phylogeo- 
graphic congruence might be the exception, not the rule. Thus, comparative phy- 
logeography offers a perspective on factors producing geographic variation, and on 
the stability of particular associations of species. 

Avise (1992) found that a diverse group of species, including freshwater fish, 
marine fish, marine invertebrates, and a terrestrial bird (seaside sparrow) showed 
a significant phylogeographic break in northeastern Florida. Because species in dif- 
ferent taxonomic classes showed evidence of this division, it seems clear that ances- 
tral species were historically codistributed and subsequently isolated in common; it 
is unlikely that such diverse species responded in common to a selective gradient. 
Comparison of taxonomically diverse species adds strength to the historical isolation
interpretation. Thus, if one declines to compare a marine fish and a terrestrial bird
because they are “not comparable,” the strength of the historical interpretation is
reduced. Choice of species compared should not be constrained
a priori (Simberloff, 1987). If only species in a single genus were compared, one 
might suspect that the species were prone to respond to some selective gradient 
because of common phylogenetic background. Thus, the protocol for comparative
phylogeography is to attempt to falsify the hypothesis that species currently codistributed
(sympatric but not necessarily syntopic) over a broad area show congruent haplotype
phylogenies.

Outcomes of phylogeographic comparisons include congruent patterns, lack of 
congruence, markedly incongruent phylogeographic patterns, or mixture of pat- 
tern(s) and no pattern. Lack of congruence and its causes are just as instructive as 
congruent patterns (Lamb et al., 1989, 1992). If two species that are today broadly 
sympatric have phylogeographic patterns that differ, one might infer that they re- 
sponded to different historical events (i.e., what was a barrier to one species was not 
a barrier to another). Two currently codistributed species that differ because one 
has discernable phylogeographic structure and the other does not could be a result 
of not sharing a long history of coassociation. The nature of mtDNA haplotype data 
suggests ways to distinguish alternative hypotheses for lack of phylogeographic congruence
(Zink, 1996a).


B. Some Avian Results 


The data in Table I serve as a basis for asking whether currently codistributed species 
of North American birds have similar phylogeographic patterns. Unfortunately, 
only species that have been surveyed over the same broad area with similar
molecular methods yield useful comparisons, and there are relatively few such studies.
Zink (1996a) compared five species that are currently widespread over the same
continental area and were subjected to similar mtDNA RFLP analyses (Fig. 11.4). The
fox sparrow has four geographically structured groups of haplotypes that can be 
traced to four common ancestors (Zink, 1994). The phenotypically variable song 
sparrow (Melospiza melodia) exhibits considerable haplotype variation, which
surprisingly is geographically unstructured (Zink and Dittmann, 1993a). The chipping
sparrow (Spizella passerina; Zink and Dittmann, 1993b) and red-winged blackbird 
(Ball et al., 1988) exhibit little if any phylogeographic structure, yet the latter spe- 
cies has many morphological subspecies. Last, the Canada goose (Branta canadensis) 
exhibits a relatively deep division into two haplotype groups (Van Wagner and 
Baker, 1990). 

The five species do not have congruent phylogeographies (Fig. 11.4). The fox 
sparrow and Canada goose have relatively deeply structured phylogeographic trees 
that are not congruent with each other, whereas the other species show no clear 
pattern of differentiation. One can, therefore, falsify the hypothesis that each species 
was historically codistributed and responded similarly to historical isolating events. 
The question becomes, why do these five species have different phylogeographic
structures? The species might have had different dispersal characteristics that
resulted in different responses to common barriers. The species might differ in levels
of genetic variability, such that species without variation could not show
differentiation because they lack the “raw materials.” Species without phylogeographic
patterns might be recently evolved, such that insufficient time has elapsed for
differentiation. Lack of pattern might occur in species with high levels of gene flow.
Perhaps the five species were simply not historically codistributed.

FIGURE 11.4 Approximate breeding distributions and diagrammatic haplotype phylogenies for five
species of North American birds (from Zink, 1996a). If mtDNA evolution proceeds at a reasonably
uniform rate, it is significant that the splits in the Canada goose and fox sparrow occurred at 2.5%
sequence divergence or less, whereas each of the species compared is more than 2.5% distant from its
nearest congener; hence, each species was extant during the times when isolating events fragmented
populations of fox sparrow and Canada goose. Reproduced with permission of the publisher.

Zink (1996a) discussed these alternatives. Each species was relatively well sampled
with similar numbers of restriction endonucleases, which revealed similar amounts 
of mtDNA variability. Hence, incomplete sampling and lack of genetic variation 
were deemed implausible reasons for differing phylogeographic patterns. Mito- 
chondrial DNA data suggest significant gene flow within species that are unstruc- 
tured, or within the phylogeographic units of the fox sparrow and Canada goose. 
Thus, gene flow is probably too high to allow differentiation in the absence of 
geographic isolating barriers. 

The mtDNA genetic distance of a species from its most closely related extant 
congener can be used as a relative measure of the length of time a species has been 
evolving independently. We might predict that species isolated for the longest time 
would show mtDNA differentiation because of a clocklike accumulation of genetic 
differences. The fox sparrow and Canada goose are the most differentiated from 
their extant sister taxa (Zink and Blackwell, 1996; Shields and Wilson, 1987), and 
these show phylogeographic structure. Zink (1996a) concluded that there was a
relationship between elapsed time since sharing a common ancestor and phylogeo- 
graphic structure, which could result because species that have been evolving in- 
dependently longer have had a heightened probability of vicariant fragmentation. 
However, this does not mean that relatively old species were historically codistrib- 
uted, as suggested by lack of congruence between the fox sparrow and Canada 
goose. Also, this relationship is not straightforward in explaining levels of pheno- 
typic differentiation. For example, the chipping sparrow and song sparrow are each 
similarly distant from their sister taxa, yet the former is divided into 7 subspecies 
and the latter into 34. Even allowing for idiosyncrasies in the way in which
taxonomists delimit subspecies, the amount of phenotypic variation in species of the same
relative “molecular age” is strikingly different. Therefore, one is left with the pos- 
sibilities that phenotypic or mtDNA evolution is rate variable, or much of subspe- 
cific differentiation is ecophenotypic. 

The mtDNA phylogeographic structures evident in these five species led to sev- 
eral conclusions. First, historical isolating events, rather than isolation by distance, 
are most consistent with the phylogeographic structure in the fox sparrow and Can- 
ada goose; in both, phylogeographic breaks occurred over short distances. The rea- 
son for the lack of phylogeographic structure in the chipping sparrow, song sparrow, 
and red-winged blackbird likely is that they have only recently colonized the cur- 
rent range (see below). The relatively deep splits in the haplotype trees of the fox 
sparrow and Canada goose probably occurred before the three currently undiffer- 
entiated species (in mtDNA) achieved their present ranges and high degree of sym- 
patry. Obviously a larger sample of codistributed species is needed. Data suggest 
that one might not find much phylogeographic congruence among North Ameri- 
can birds; perhaps this is because of the recent deglaciations and habitat displace- 
ments (Pielou, 1991) and their effects on bird distributions. 


C. Mismatch Distributions and
Population Histories 


Central to my argument is the notion that present distributions are poor indicators 
of historical ones, especially over the times required for phylogeographic
congruence to evolve (Chesser and Zink, 1994). Past population increases are
likely to accompany major range expansions, and the former can be inferred from
haplotype data (Avise et al., 1988; Rogers and Harpending, 1992; Rogers, 1995, 
1996). For example, a dramatic (500-fold) and sudden increase in population size 
results in a haplotype tree that is relatively unstructured, with most haplotypes trac- 
ing to a common ancestor just prior to the population expansion (a "bush" to- 
pology for the haplotype tree). Also, the distribution of pairwise restriction site 
differences between haplotypes, termed by Rogers and Harpending (1992) the mis- 
match distribution, exhibits a “wave” if there was a significant population expansion 
in an unstructured population. The wave is centered near the point of population 
expansion on the mutation distance scale used (Fig. 11.5). The method provides
information about upper ($\theta_1$) and lower ($\theta_0$) bounds on population increases,
and the relative timing of the increases in units of mutational distance ($\tau$). If the
above inferences about phylogeographic histories were correct, such waves should 
be evident in the data for the red-winged blackbird, chipping sparrow, and song 
sparrow. 

The mismatch distributions (Fig. 11.5) show waves suggestive of population in- 
creases for the song sparrow and red-winged blackbird, but not for the chipping 
sparrow or fox sparrow (data for Canada goose unavailable). Avise et al. (1988) used 
a different method to suggest an increase in population size of the red-winged 
blackbird. Population increases in the song sparrow and red-winged blackbird are 
consistent with recent range expansions. The lack of such an expansion in the chip- 
ping sparrow could mean that although the species might have recently expanded 
its range, it did not have a low population size wherever it resided during
Pleistocene glacial maxima. Because the fox sparrow is highly structured, the mismatch
distribution shows two peaks (corresponding to within and among-group differ- 
ences). The mismatch distribution (not shown) for the common grackle similarly 
indicated a population increase, a conclusion also reached by considering the hap- 
lotype tree (Fig. 11.1) and pairwise distribution of interindividual haplotype dis- 
tances (Avise et al., 1988). 

The use of the mismatch distribution aids phylogeographic studies in revealing 
population expansions. Such inferences, not without caveats (Rogers, 1996), 
can help determine whether species that lack phylogeographic patterns only 
recently expanded their ranges. As one might predict for North American birds that 
colonized recently deglaciated areas, waves in mismatch distributions are apparent. 
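
The pairwise tabulation behind a mismatch distribution can be sketched in a few lines of Python. This is a minimal illustration only, not the model-fitting procedure of Rogers and Harpending (1992): the function name `mismatch_distribution`, the encoding of each haplotype as a string of 0/1 restriction-site states, and the toy data are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(haplotypes, counts=None):
    """Return {i: F_i}, the relative frequency of pairs of individuals
    whose haplotypes differ at i restriction sites.

    haplotypes: equal-length strings of '0'/'1' site states (hypothetical encoding).
    counts: number of individuals carrying each haplotype (defaults to 1 each).
    """
    if counts is None:
        counts = [1] * len(haplotypes)
    if len({len(h) for h in haplotypes}) > 1:
        raise ValueError("all haplotypes must score the same set of sites")

    pair_counts = Counter()
    # Pairs of individuals that carry different haplotypes.
    for (h1, n1), (h2, n2) in combinations(zip(haplotypes, counts), 2):
        n_diff = sum(a != b for a, b in zip(h1, h2))
        pair_counts[n_diff] += n1 * n2
    # Pairs of individuals that share a haplotype differ at zero sites.
    for n in counts:
        pair_counts[0] += n * (n - 1) // 2

    total = sum(pair_counts.values())
    return {i: pair_counts[i] / total for i in sorted(pair_counts)}


# Toy data: three haplotypes scored at four sites, four individuals in all.
toy_haplotypes = ["1100", "1010", "0011"]
toy_counts = [2, 1, 1]
toy_distribution = mismatch_distribution(toy_haplotypes, toy_counts)
```

Plotting the resulting frequencies against the number of site differences gives the kind of curve in which a "wave" would be sought; judging whether a wave is present still requires comparison with the theoretical distributions cited in the text.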


FIGURE 11.5 Mismatch distributions for four avian species. For each species, open circles indicate
the mismatch distribution derived from the original matrix of restriction sites. $F_i$ indicates the frequency
of each class, and $i$ represents the number of restriction sites separating pairs of haplotypes. The solid line
is the theoretical distribution fit using Eqs. (2) and (3) of Rogers (1996), and the dashed line shows the
fit of the three-parameter model (see Rogers, 1996). For the song sparrow (upper left), the mismatch
distribution shows a peak at $3/(2u)$ generations ago, where $u$ is the aggregate mutation rate of the region
of DNA under study. The parameter estimates suggest that the population was initially very small
($\theta_0 = 0$), that the postexpansion population was moderately large ($\theta_1 = 18.13$), and that the expansion
occurred $3.25/(2u)$ generations ago. The mismatch distribution for the red-winged blackbird (upper right)
is consistent with a recent population expansion within 4 units of mutational time from present, starting
from a relatively small population [alternatively, the mutation rate could be low (Rogers, 1996), but there
is no evidence of this]. The mismatch distribution for the chipping sparrow (bottom left) does not have a
clear wave, and is not consistent with a population expansion; the mismatch distribution is similar to that
expected for the theoretical distribution of a population at equilibrium between mutation and drift. The
mismatch distribution for the fox sparrow (bottom right) shows two distinct peaks, one at zero and one
at 17. Such a distribution can reflect (1) a history with two bottlenecks in population size, or
(2) geographic structure (A. R. Rogers, personal communication); Zink (1994) found clear evidence of
geographic structure.


D. Comparative Phylogeography
and the History of Communities 


Comparative phylogeography provides a means to evaluate the stability of com- 
munity structure and coevolutionary models. An implication of phylogeographic 
incongruence is lack of historical continuity in community membership (Bennett, 
1990), which could influence the likelihood of certain types of coevolution. For 
example, Rothstein (1990) suggests that over long periods of time hosts of parasitic 
cowbirds evolve defenses and force the parasites to become specialists (see Lanyon, 
1992, for an alternative view). If phylogeographic studies suggest lack of significant 
historical association of cowbirds and their hosts, the evolution of host-rejection
behavior might be inferred to be very rapid, a plausible hypothesis if parasitism exerts
a strong selective force (Rothstein, 1990). Hypotheses of coevolution that assume sympatry
(and syntopy) on an evolutionary timetable could be coupled with comparative 
phylogeographic studies. Congruent phylogeographies provide evidence for spe- 
cies' long-term associations, and using molecular characters these might be placed 
in a temporal scale. 


VI. PROSPECTUS 


Detecting genetic variation, documenting its geographic deployment, and inferring 
evolutionary processes are central themes of evolutionary analysis of populations. 
New methods of molecular analysis successively provide greater resolving power. 
Direct sequencing of DNA stands to be the next most influential method to pro- 
vide insight into the nature of avian population structure. Although patterns of ge- 
netic variation at microsatellite loci (e.g., McDonald and Potts, 1994) can provide 
fine-scale resolution, the difficulty of phylogenetic interpretation of the data could 
limit use of the technique for some evolutionary analyses. Inferences from haplo- 
type phylogenies based on coalescence theory provide powerful analytical tools 
(e.g., Slatkin and Maddison, 1989; Crandall and Templeton, 1993).

Although the need is often mentioned for nuclear gene assays, problems accom- 
pany nuclear gene analysis. Because of the four times greater effective population 
size of nuclear genes, the time to coalescence of nuclear gene alleles greatly exceeds 
that for mitochondrial genes (therefore, rapidly evolving, nonrecombining nuclear genes are
needed). Although population structure can mitigate this relationship, nuclear genes 
on average will have a reduced probability of capturing population-level fragmen- 
tation relative to a well-resolved mtDNA gene tree (Moore, 1995). Nonetheless, at 
this writing, it is difficult to predict the future of mtDNA and nuclear DNA analy- 
ses; selective sweeps in mtDNA (Rand et al., 1994) could hinder population infer- 
ences drawn from mtDNA data. 
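
The fourfold difference mentioned above follows from simple bookkeeping, sketched here in Python under idealized assumptions that are not stated in the text: an equal sex ratio, strictly maternal and haploid transmission of mtDNA, diploid biparental nuclear genes, and a neutral Wright-Fisher population in which two gene copies drawn from M transmitted copies coalesce on average M generations in the past. The function names are hypothetical.

```python
def transmitted_gene_copies(n_breeding_adults, marker):
    """Number of gene copies passed on per generation (idealized)."""
    if marker == "nuclear":
        # Diploid and biparental: two copies per breeding adult.
        return 2 * n_breeding_adults
    if marker == "mtDNA":
        # Haploid and maternal: one copy per breeding female,
        # with females assumed to be half of the adults.
        return n_breeding_adults / 2
    raise ValueError(f"unknown marker: {marker}")


def expected_pairwise_coalescence(n_breeding_adults, marker):
    """Expected generations back to the common ancestor of two sampled copies.

    Under the neutral Wright-Fisher assumption this equals the number of
    transmitted gene copies.
    """
    return transmitted_gene_copies(n_breeding_adults, marker)


nuclear_time = expected_pairwise_coalescence(1000, "nuclear")
mtdna_time = expected_pairwise_coalescence(1000, "mtDNA")
ratio = nuclear_time / mtdna_time  # fourfold under these assumptions
```

Departures from these assumptions (unequal sex ratios, population structure, selection) change the ratio, which is why the text treats the fourfold figure as an average expectation rather than a fixed rule.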

I suggest that studies of geographic variation will have greatest value when they 
encompass a large area, permitting comparisons of codistributed species. The num- 
bers of individuals per locale need not be as high as they were for allozymes because 
the estimation procedures are haplotype and phylogeny based, not allelic frequency
based. Roots for intraspecific haplotype phylogenies will also be required. These 
aspects will further the cross-enlightenment of population genetics and systematics 
(Avise et al., 1987). 


I. INTRODUCTION


Tropical forest ecosystems are credited with containing unparalleled biodiversity, 
but unfortunately are increasingly threatened by human activities. Up until now 
most conservation priorities have been modeled around present patterns of species 
richness and/or endemism. However, from an evolutionary perspective, only by 
understanding the processes that cause these patterns can informed and comprehen- 
sive conservation policies be formulated. Many theories have been postulated that 
link historic processes to the ecological patterns evident today (paleogeography
hypothesis, gradient hypothesis, river hypothesis, river refuge hypothesis, and refuge hypothesis),
but rather than clarifying the subject, these theories have tended to make it more
controversial (see reviews by Haffer, 1997, and Tuomisto and Ruokalainen, 1997).

The difficulty lies in disentangling today's patterns of species distributions from 
the historic processes that caused them. For example, replacement of closely related 
species across a geographic barrier or adjacent ecoregions may lead one to conclude 
that the present physical or environmental barrier caused the initial speciation event 
(Endler, 1977). However, such replacements may be secondary since sharp sutures 
between previously reproductively incompatible species are most easily maintained 
where there is a physical barrier or gradient. This has been exemplified by DNA 
studies of Andean rodents (Patton and Smith, 1992) and birds (Arctander and 
Fjeldså, 1994). These authors showed that species that display present-day
parapatric altitudinal replacement on mountain slopes actually originated from different
mountain ranges, suggesting that parapatry is secondary. In addition, Salo (1988) 
had associated habitat complexity in fluvial plains in the upper Amazon area with 
speciation, but later suggested (J. Salo, personal communication) that dynamic
habitat mosaicism is more likely to be a diversity-maintaining process and not the 
cause of the initial speciation process. 

To separate initial speciation events from the process of species redistribution 
and accumulation in areas of high current carrying capacity, we examined macro- 
scale patterns of avian species richness and related this to macroscale patterns of 
diversification in tropical forests. Our aim was to find out where recent speciation 
has been most intensive, and where old lineages (which only show phyletic specia- 
tion) predominate. We then used this pattern to formulate a model hypothesis for 
continental-scale speciation and redistribution, which was then used as a guideline 
for identifying case studies for detailed evaluation using DNA sequence data. 

This study compared data from South America and Africa, first because the 
marked difference in species richness indicates more intensive diversification in the 
former, and second because speciation patterns can be compared between the
almost uninterrupted band of montane forest along the eastern slope of the
tropical Andes region and the chains of mutually isolated “montane forest islands” in
Africa.

We review here the development of the model hypothesis, where biogeographic 
information was combined with the DNA-DNA hybridization data of Sibley and 
Ahlquist (1990) (see Fjeldså, 1994, for details). Thereafter we present preliminary
results of ongoing studies of smaller species groups, designed to evaluate different 
predictions of the model hypothesis. 


II. THE MODEL HYPOTHESIS
A. Basic Assumptions 


Regions of intensive speciation are likely to be characterized by a high proportion 
of species representing recent phylogenetic events relative to old lineages with little 
or no recent speciation. In addition, local aggregates of relictual species, each of
which has undergone a severe contraction from large parts of the initial range of its
lineage, indicate places that have remained ecologically stable during climatic
fluctuations throughout the Quaternary (Fjeldså, 1995; Fjeldså and Lovett, 1997). To
introduce a phylogenetic framework and an approximate time dimension into our study,
we used the results of DNA-DNA hybridizations by Sibley and Ahlquist (1990)
(Fjeldså, 1992, 1994). These assumptions allow temporal and spatial comparisons
of regions of species richness with regions of paleoecological stability.


B. Method 


One advantage of Sibley-Ahlquist phylogenetic data for this study is that a uniform
technology was used to measure the time dimension across all groups of birds. The 
data set covers the global avifauna remarkably well, with altogether 1700 species 
studied, the data gaps consisting mostly of terminal branches of the phylogeny. Cer- 
tain taxa are underrepresented but the data set does not appear to be biased toward 
any particular geographical region. 

The molecular technique is based on thermostability of heteroduplex DNA. All
single-copy nuclear genes of two species are hybridized, after one of the
single-copy nuclear genomes has been radiolabeled to serve as a tracer. The method
measures the melting curve of the single-stranded tracer DNA, taking the midpoint
($T_{50}H$) as a measure of the overall genetic divergence. Attempts to calibrate the
molecular clock suggest a tentative rate of $\Delta T_{50}H$ 1.0 = 2.3 million years (MY) (Sibley and
Ahlquist, 1990, p. 703), but the relationship between DNA distance values and 
time of divergence is not simple (Sibley and Ahlquist, 1990; see, e.g., p. 400 and 
Fig. 99) and the divergence may be retarded considerably in large birds with delayed 
sexual maturation. The relationships among species are determined by UPGMA 
clustering. 
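
As a rough worked illustration of the tentative calibration quoted above, the conversion between a hybridization distance and elapsed time is a single multiplication. The sketch below is an assumption-laden convenience, not part of the original analysis: the function names are hypothetical, the rate is treated as constant, and the caveats about large birds with delayed maturation are ignored.

```python
# Tentative calibration quoted in the text: one distance unit = 2.3 million years.
MY_PER_DISTANCE_UNIT = 2.3


def distance_to_my(distance, my_per_unit=MY_PER_DISTANCE_UNIT):
    """Convert a DNA-DNA hybridization distance to million years (MY),
    assuming a constant rate (a strong simplification)."""
    return distance * my_per_unit


def my_to_distance(age_my, my_per_unit=MY_PER_DISTANCE_UNIT):
    """Inverse conversion: age in MY to the corresponding distance value."""
    return age_my / my_per_unit


# A distance of 2.5 corresponds to about 5.75 MY, i.e., roughly 6 MY.
age_at_2_5 = distance_to_my(2.5)
```

Because the relationship between distance and time is acknowledged not to be simple, values obtained this way are best read as order-of-magnitude guides.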

Despite comprehensive methodological control, including studies of generation 
time effects, a multitude of technical and molecular aspects cause doubts about the 
power of resolution claimed by the authors (see O'Hara, 1991; Mindell, 1992;
Sibley et al., 1993; Harshmann, 1994). Sibley and Ahlquist have taken the unusual
approach of “correcting” for different rates of DNA evolution rather than using an
algorithm that does not assume rate constancy. It is also possible that the UPGMA 
clustering compresses the deep parts of the phylogeny, and that lineage-specific 
features such as biased base composition in some lineages could influence the ap- 
parent depth of nodes and their sequence, for example in the "explosive" radiation 
of passerine bird families in the mid-Tertiary.

However, since our only use of the Sibley-Ahlquist phylogenies was to define
two markedly different categories of species (deep branches and recent radiations 
with many species), the effect of the above-mentioned inaccuracies is likely to be 
small. For this reason and despite reservations about the precise sequence and depths 
of nodes, we accepted the trees given by Sibley and Ahlquist as being representative
of avian diversification. 

The Sibley-Ahlquist data provide relative timings of early radiations of most 
groups, although there is a lack of detailed reconstruction of top branches causing 
unresolved polytomies. This was corrected by rigorously determining which
species are included within specific nodes on the Sibley-Ahlquist trees and which
are not, using all published systematic revisions and phylogenetic hypotheses (the 
source list is too large to be included here) and personal judgments, where the 
Sibley-Ahlquist phylogenies provide a basis for polarizing morphological charac- 
ters. Thus, voucher specimens were used extensively. For the sake of methodologi- 
cal homogeneity, the Sibley-Ahlquist sequences of nodes were used wherever 
viewpoints were contradictory. For all parts of the phylogenies that were not
supported by DNA-DNA hybridizations the nodes (equating to N species − 1) were
spaced regularly above the “baseline” $T_{50}H$ value. Several branches were discarded
from the analysis because baseline dating was not possible owing to lack of original 
data. To make the choice of species as unbiased as possible, acceptable phylogenies 
that met our criteria and identified recent radiations and old lineages were recon- 
structed (Fig. 12.1), before the geographical analysis was initiated. 

Phylogenetic lineages were identified and mapped. Groupings were assigned
according to the age of the lineage: (1) “new species” for radiations with at least 10
species emerging after $T_{50}H$ 2.5 (i.e., during the last 6-MY period with increasing
climatic fluctuations of the Croll-Milankovitch type), and (2) “old species” (mostly
monotypic genera) representing a single lineage from before $T_{50}H$ 2.5 (in the
original analysis by Fjeldså, 1994, this category was divided into two subcategories). As the
categories are separated by a major evolutionary dichotomy, the errors will at most
consist of inclusion (in group 1) of a few radiations that started slightly earlier than
$T_{50}H$ 2.5, or the inclusion (in group 2) of a few monophyletic lineages that are
slightly too young. For the African tropics 233 species were identified as recent
radiations and 82 species as older branches; for South America the corresponding
samples comprised 648 and 107 species (see Fjeldså, 1994, for taxa used). The
ratio of new to old species was calculated for each 200 × 200 km geographic grid
cell over both continents, using distributional data contained in standard reference
books for the two continental avifaunas (Fig. 12.2).
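
The grid-cell calculation can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used: occurrence records are assumed to be given as projected coordinates in kilometers, each species is assumed to be pre-labeled "new" or "old", and the names `cell_of` and `new_old_ratio` are hypothetical. Cells with no old species return `None`, since a ratio cannot be computed there.

```python
from collections import defaultdict

CELL_KM = 200  # grid-cell edge length used in the text


def cell_of(x_km, y_km, cell_km=CELL_KM):
    """Map projected coordinates (km) to a grid-cell index."""
    return (int(x_km // cell_km), int(y_km // cell_km))


def new_old_ratio(records, cell_km=CELL_KM):
    """records: iterable of (species, category, x_km, y_km),
    where category is 'new' or 'old'.

    Returns {cell: ratio of distinct new species to distinct old species},
    or None for cells that contain no old species.
    """
    species_by_cell = {"new": defaultdict(set), "old": defaultdict(set)}
    for species, category, x_km, y_km in records:
        species_by_cell[category][cell_of(x_km, y_km, cell_km)].add(species)

    ratios = {}
    all_cells = set(species_by_cell["new"]) | set(species_by_cell["old"])
    for cell in all_cells:
        n_new = len(species_by_cell["new"].get(cell, ()))
        n_old = len(species_by_cell["old"].get(cell, ()))
        ratios[cell] = n_new / n_old if n_old else None
    return ratios


# Toy records: species "a" is recorded twice in the same cell and counted once.
toy_records = [
    ("a", "new", 10, 10),
    ("b", "new", 150, 20),
    ("c", "old", 199, 199),
    ("d", "old", 250, 10),
    ("a", "new", 30, 30),
]
toy_ratios = new_old_ratio(toy_records)
```

In practice a minimum species count per cell would also be needed before a ratio is treated as reliable, in line with the low-density cells flagged in Fig. 12.2.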


III. OLD AND NEW SPECIES IN SOUTH AMERICA 


The intensive folding in the tropical Andes region since the Miocene blocked the
earlier outlet of the Amazon into the Pacific Ocean, thereby altering the patchwork
of different habitats, and potential biogeographic barriers, over large parts of the
continent. A number of new habitats were formed in the Andes, and a geological
subsidence, creating a hydrologically unstable zone in the Chaco, effectively
isolated the Andean biota from that of the Brazilian Highland (Hanagarth, 1993; Silva,
1995). Today, humid forests form a continuous band along the eastern slope of the
Andes, while the upper cloudforest zone is dissected by deep valleys with arid
climates on the bottom.

FIGURE 12.1 An example of the two categories of phylogenetic reconstructions representing (A) a
rapid early radiation (probably in the early Miocene) and slow rates of later diversification (bush and
helmet shrikes and vangas, Malaconotinae; for earlier discussions see Meise, 1968; Benson et al., 1971;
Traylor, 1970; Sibley and Ahlquist, 1990) and (B) a group that first radiated mainly in the Pacific and
Indo-Malayan areas and showed a strong recent (Plio-Pleistocene) radiation in Africa (starlings,
Sturnidae; last revised by Amadon, 1956; Beecher, 1978). Small terminal dots identify lineages identified as
“old species” while bracketed groups identify radiations of “new species” (see text for explanation).
Heavy lines indicate branching order as determined by Sibley and Ahlquist (1990); the rest were
reconstructed according to our criteria (see text).

The distribution of nodes connecting South American taxa on the Sibley-Ahlquist
phylogenies shows an explosive burst of differentiation starting from $T_{50}H$ 5
(Fjeldså, 1994, Fig. 1). Figure 12.2A clearly demonstrates that this differentiation
was most intensive in those parts of the tropical zone that were affected by mountain
folding and tectonic changes. By far the most important area is the tropical Andes
region and its transition toward the Amazon lowlands. Conversely, the proportion
of young species is moderate to low in much of the Amazon lowland (Fig. 12.2A),
being highest in regions with tectonically active crystalline arcs and lowest in those
parts that are characterized by high fluvial disturbance leading to complex and
dynamic habitat mosaics of “fossil” and active floodplains (see Kalliola et al., 1993). It
is noteworthy that more than 80% of “new species” occupying the Amazon
lowlands also inhabit regions lying outside of the lowland forest biome, thereby
obscuring their origins. Other hydrologically unstable areas (Chaco, Llanos, and
northwestern Colombia) are also characterized by old species, with little recent
differentiation.

FIGURE 12.2 Geographical variation in (A) South America and (B) Africa, of the ratio between
“new species” representing recent radiations and “old species” (modified from Fjeldså, 1994).
Blackened areas indicate a ratio of more than 3 new species to old species; narrow spaced vertical lines, more
than 2.5; horizontal lines, more than 2; and widely spaced vertical lines, more than 1.5. Asterisks mark
an aggregate of biogeographic relicts (sensu Cronk, 1992), which may indicate long-term ecoclimatic
stability. In Africa these regions are the Cameroon Mountains, Angola Scarp, Rwenzori Mountains/
Itombwe Forest in eastern Zaire, Uluguru, Udzungwa and East Usambara Mountains in Tanzania.
Interrupted signatures indicate that the species density is too low for calculating reliable ratios.


IV. OLD AND NEW SPECIES IN AFRICA 


Unlike South America, Africa has been little affected by mountain folding, the most
significant geological changes affecting the forest biota being (1) a general drying of
northern Africa after the continent “collided” with Asia in the early Miocene and
the Tethys Sea was closed (see Axelrod and Raven, 1978) and (2) uplifting and
rifting isolating the eastern lowland forests from the main Guinea-Congolian
rainforest block during the Miocene (Lovett, 1993; Coppes, 1994). The montane
forests are continuous on the transition between the Congo Basin and the Albertine
Rift, but otherwise are discontinuous (Cameroon, Angola, Ethiopia highland) with
a large but punctuated “montane circle” following the Albertine and Malawi rifts,
the Eastern Arc crystalline fault-blocks in Tanzania and the Kenya highlands. Along
this “circle,” it is assumed that the ecoclimatically most stable parts are in upper
Zaire and on east-facing escarpments of the Eastern Arc mountains, which are
under direct climatic influence from the Indian Ocean (Lovett, 1993). The
distribution of relictual species of forest birds seems to locate this stability quite precisely to
west of the Rwenzori Slope to Itombwe forest in upper Zaire and the Udzungwa
Scarp and Uluguru and East Usambara Mountains in Tanzania (Fig. 12.3).

FIGURE 12.3 Geographical patterns of the density of highly distinctive forest bird species that can be
regarded as (A) biogeographical relicts and (B) aggregates of neoendemics [“new species” with restricted
range (<50,000 km²) representing vicariance patterns]. Number of species of each category per grid
square (as described in text) is indicated; large numbers in boldface represent highest densities of species
in each category. Note a correlation in regions that contain both high densities of biogeographical relicts
and neoendemics (see text). Species defined as relictual comprise Afropavo congensis, “Francolinus” nahani,
Xenoperdix udzungwensis, Himantornis haematopus, Canirallus oculeus, Phodilus prigoginei, Otus irenae,
Pseudocalyptomena graueri, Malaconotus alius, Prionops alberti, Arcanator orostruthus, Swynnertonia swynnertoni,
Hemitesia neumanni, Graueria vittata, Orthotomus metopias, Apalis moreaui, Bathmocercus winnifredae, Anthreptes
pallidigaster, Nectarinia rufipennis, Ploceus golandi.

The most recent diversification in Africa is associated with geological swells 
in the savanna zone, and mountains, while old species dominate in the extensive 
lowland forests and in hydrologically unstable lowlands (Chad and Sud Basins; 
Fig. 12.2B). A study of only forest birds (and forest plants, Fjeldså and Lovett, 1997) 
further emphasizes that the diversification in the Pleistocene was associated with 
highlands outside the Congo Basin, and the “montane circle” in eastern Africa in
particular. “New species” occurring inside the Congolian rainforest are generally
widespread or disjunct, and as with South America more than 80% of them also live
elsewhere on the continent, obscuring their origins (Fig. 12.2).


V. IMPLICATIONS FOR THE MODEL 


Species arise by dynamic processes acting in montane regions, but persist in flood- 
plains. The species-rich tropical lowland forests were often regarded as “centers of 
origin" or “dispersal centers" for tropical biodiversity. Over the last three decades, 
the prevailing explanation for the origin of the extraordinary biodiversity of tropical
lowland forests was the refuge theory. Proposed initially for South American birds by 
Haffer (1969, 1974), this theory was soon applied to the African avifauna (Diamond 
and Hamilton, 1980; Mayr and O'Hara, 1986; Crowe and Crowe, 1982) as well 
as other groups, such as plants, and insects in different tropical regions (reviews 
in Prance, 1982; Whitmore and Prance, 1987). The theory assumes that species 
evolved by isolation in forest areas that remained stable despite global ecoclimatic 
changes. The impact of these changes, forced by cyclical changes in the earth's orbit 
(Milankovitch cycles), has been accentuated by a general global cooling during the 
last few million years, with large glacial peaks (or arid periods in the tropics) during 
the last 0.9 MY (see Bartlein and Prentice, 1989; Bennett, 1990; Hooghiemstra 
et al., 1993). 

Lowland floodplains are dominated by species of pre-Pleistocene age, with no 
particular concentration of younger species in the postulated refuge areas (Fig. 12.2; 
see Amorim, 1991, for similar arguments). Instead, much more speciation takes 
place in areas with a distinctive topographic structure (see also Vrba, 1993). Since a 
large proportion of young species in lowland rainforests are widespread, often ex- 
tending to forested escarpments or gallery forests in adjacent biomes, alternative 
explanations of the initial speciation events are indeed possible (notably for Africa). 

Undoubtedly, a great deal of differentiation is due to vicariance caused by tec- 
tonism and erosion creating isolating barriers. However, a marked correlation be- 
tween "hotspots" for species that are part of vicariance patterns (Fig. 12.3B) and 
peak concentrations of relictual forms (Fig. 12.3A) suggests an association between 
speciation and intrinsic properties of specific mountain scarps (see Fjeldså, 1995, for
the Andes; see Stebbins and Major, 1965, for the Californian flora). Where prevalent
atmospheric flows interact with topography, climate can be moderate, creating
stable cloudforest conditions locally (Fjeldså et al., 1997). In this case, the initial
isolating mechanism is not necessarily a physical barrier between hotspots but could 
be their intrinsic high spatiotemporal heterogeneity that produces high species 
turnover and robust communities, where specialist species cannot easily remain es- 
tablished. This would form the basis for a highly dynamic process of isolation and 
opportunities for short-term dispersal between hotspots. 
As a working hypothesis we postulate that the evolution of tropical forest birds
is driven by a dynamic process of local isolation in stable montane forests with
occasional dispersal between them, and that new species may gradually expand into
other habitats, and in the end accumulate in the extensive tracts of lowland forest or
woodland savannas. In the Amazon basin, the amount of tectonically induced
flooding and the dynamics of meandering rivers make it
evident that biodiversity is redistributed (Kalliola et al., 1993). Tropical lowland
forests are highly unstable on the local scale, and we suggest that this high level 
of spatiotemporal heterogeneity makes them act as “museums” where large num- 
bers of species (of potentially diverse origins) have accumulated over long periods 
of time. 

Critical evaluation of whether hotspots are centers of origin or reflect subse- 
quent redistribution, and whether a dynamic speciation process in montane areas 
can deliver recruits to lowland biota, is needed. In the following section we describe 
three studies that assess specific questions of patterns and processes of distribution 
of montane avifauna in both Africa and South America. 


VI. CASE STUDIES OF BIOGEOGRAPHIC PATTERNS 
IN TROPICAL MOUNTAINS 


A. The Andes 


1. Flycatchers of the genus Leptopogon are represented by four species that occur
in the South American lowlands and highlands. All four species are forest dwelling,
and their distribution near the Andes can be summarized as follows: (a) L. amaurocephalus,
tropical lowland (to 600 m); (b) L. superciliaris, upper-tropical (600 to
2100 m); (c) L. taczanowskii, upper subtropical (1600 to 2700 m); and (d) L. rufipectus,
upper subtropical (1600 to 2700 m). Leptopogon taczanowskii and L. rufipectus are
allopatric taxa that are separated by the River Marañón valley, in northern Peru
(Traylor, 1979; Fjeldså and Krabbe, 1990; Bates and Zink, 1994).

Bates and Zink (1994) studied the phylogenetic relationships of these four spe- 
cies, using both allozymes and mitochondrial DNA (mtDNA). They found that 
(1) Leptopogon is indeed a monophyletic genus; (2) L. amaurocephalus is the basal 
member of the genus; and (3) L. superciliaris is the sister-group of the clade formed 
by L. taczanowskii and L. rufipectus. This pattern supports the hypothesis that the 
evolution of Leptopogon species was driven by vicariance events that basically fol- 
lowed the uplift of the Andes during the late Tertiary. In fact, by using published 
estimates of molecular-clock calibrations for both allozymes and mtDNA, Bates 
and Zink (1994) found a good correlation between genetic differentiation within 
Leptopogon and the timing of uplift of the Bolivian Andes. 

2. Tapaculos of the genus Scytalopus (family Rhinocryptidae) are small (20 to
40 g), sooty-gray-colored birds that inhabit the dense understory of humid forest
and scrub in the Andes, Central America, and eastern and central Brazil (Sibley and
Monroe, 1990; Ridgeley and Tudor, 1994). Traditional systematics using plumage 
characters have recognized 11 species in this genus, but detailed studies including 
vocal characters suggest that many more species are involved (Krabbe and Schulen- 
berg, 1997). 

Most species of Scytalopus, as they are currently defined, have complex patterns 
of distribution and geographic variation along the Andes. For instance, species that 
are similar in plumage tend to replace each other in different altitudinal zones 
(Fjeldså and Krabbe, 1990). The simplest model for explaining this pattern could be
one based on parapatric speciation along an environmental gradient, such as sug- 
gested by Endler (1977). In this case, species along a gradient are predicted to be 
monophyletic and no dispersal events are needed to explain this distribution pat- 
tern. Alternatively, one can propose that the pattern of altitudinal replacement of 
species along a mountain slope is a secondary event, caused by dispersal between 
different slopes. In this model, species found on the same slope are not predicted 
to be monophyletic. These species could have evolved by an allopatric (vicariant) 
mode of speciation due to isolation in different mountain ranges and thereafter 
dispersed between slopes, establishing the distribution pattern currently observed. 
As these two models suggest different sets of phylogenetic relationships (Patton 
and Smith, 1992), they can be tested by examining the phylogenetic relationships 
of populations. 
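
The contrasting predictions of the two models can be sketched in a few lines of Python. This is an illustration only: the nested-tuple tree encoding, the function names, and the four taxon labels are assumptions introduced here and do not represent data from any study cited in this chapter. Under the parapatric model, taxa sharing a slope should form a clade; under the allopatric model with later dispersal, they should not.

```python
def leaves(tree):
    """Return the set of taxon names at the tips of a rooted nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def is_monophyletic(tree, taxa):
    """True if some node of the rooted tree has exactly `taxa` as its tips."""
    taxa = frozenset(taxa)
    if isinstance(tree, str):
        return taxa == frozenset([tree])
    if leaves(tree) == taxa:
        return True
    return any(is_monophyletic(child, taxa) for child in tree)


# Hypothetical taxa: two altitudinal replacements on each of two slopes.
slope_1 = {"slope1_low", "slope1_high"}

# Parapatric model: each slope's taxa diverged in situ along the gradient.
parapatric_tree = (("slope1_low", "slope1_high"), ("slope2_low", "slope2_high"))

# Allopatric model: sister taxa live on different slopes and later dispersed.
dispersal_tree = (("slope1_low", "slope2_low"), ("slope1_high", "slope2_high"))

print(is_monophyletic(parapatric_tree, slope_1))  # same-slope taxa form a clade
print(is_monophyletic(dispersal_tree, slope_1))   # same-slope taxa do not
```

The check is only meaningful for a rooted tree, so in practice the result depends on the choice of outgroup.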

Arctander and Fjeldså (1994) evaluated these models by studying 14 tapaculo
taxa, some of which have adjacent distributions along the same mountain slope in 
Ecuador. They compared 285 bp of mitochondrial cytochrome b (cyt b) from each 
species in order to elucidate their phylogenetic relationships. Although they used 
only a small fragment of cyt b, pairs of taxa that have distinctive songs and live 
essentially in sympatry (i.e., those that are distinct biological species) differed from 
each other by 23-42 transitions (ts; average 28.3) and 1-13 transversions (tv; average
4.7), with a ts-to-tv ratio of 6:1.
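
For readers unfamiliar with how such counts are obtained, the following Python is a minimal sketch of tallying transitions and transversions between two aligned sequences. The function name and the toy sequences are hypothetical and are not the cyt b data discussed above; sites with gaps or ambiguity codes are simply skipped, which is one of several possible conventions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue  # identical site, or a gap/ambiguity code: not counted
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            tv += 1  # purine <-> pyrimidine
    return ts, tv


# Toy alignment for illustration only (not real Scytalopus sequence).
ts, tv = count_ts_tv("ACGTACGTAC", "GCGTATGTCA")
print(ts, tv)

# The averages reported in the text give the stated ratio of about 6:1.
print(round(28.3 / 4.7, 1))
```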

On the basis of the preliminary phylogeny of the 14 taxa evaluated so far and
four-taxon tests used to evaluate specific hypotheses, Arctander and Fjeldså (1994)
showed that in the two areas that have the highest number of parapatric species (five
species on one slope) every one of these was more closely related to a taxon inhabiting
another mountain range than to its nearest neighbor on the same slope. Thus
the parapatric model was falsified, and it is suggested that divergence in the Andean
Scytalopus was allopatric, in small disjunct isolates in different parts of the Andes,
and that the currently observed pattern of replacement in different altitudinal belts
is a secondary event.


B. East African Mountains 


The circle of mountain islands of East Africa is composed of the Albertine and
Malawi Rift Mountains, the Tanzanian Eastern Arc Mountains, and the Kenya
Highlands (Figs. 12.2 and 12.4). Biogeographically linked to these mountains are
the Cameroon Mountains and Angola Scarp. To elucidate the complex patterns of
vicariance and dispersal between the mountain islands of Africa, we focused our
attention on the strongly polytypic greenbuls of the genus Andropadus.

FIGURE 12.4 Distribution of the three montane Andropadus species of East Africa. Blackened areas
indicate known species range with approximate distribution of relevant subspecies (in italic) as discussed
in text. In addition, suggested dispersal routes are indicated by arrows and highland regions referred to
in text are shown on the A. tephrolaemus distribution map.

The genus Andropadus is represented by 11 obligate forest-dwelling species. 
Once united with Pycnonotus (bulbuls) (Delacour, 1943; Rand, 1958), they were 
separated by White (1962) and Hall and Moreau (1970). They differ in plumage 
and ecology, and possibly origins, with Pycnonotus being a primarily Asian genus. 
Four species (A. tephrolaemus, A. masukuensis, A. milanjensis, and A. montanus) are 
strictly montane and represent (according to our current evaluation) a recently ra- 
diated, monophyletic group. All except for A. montanus are widely sympatric in 
eastern Africa (Fig. 12.4), suggesting that the initial vicariance events leading to 
speciation were followed by periods of range expansion. Taxonomically this group 
of mountain greenbuls remains obscure and efforts to identify evolutionary rela- 
tionships between species and taxonomic status of isolated populations within each 
species remain inconclusive, having been largely influenced by geographic rather 
than sound cladistic analysis (see Dowsett and Dowsett-Lemaire, 1993). 

Most intriguing is the relationship between A. masukuensis and A. tephrolae- 
mus, which are morphologically similar and show broadly overlapping distributions 
around the “montane circle.” Andropadus masukuensis is subdivided into four relatively
indistinct forms: gray-headed (kakamegae/kungwensis along the Albertine Rift and
Kenya) and green-headed (masukuensis/roehli along the Eastern Arc; Fig. 12.4).
Conversely, A. tephrolaemus is subdivided into six distinctive forms within the mon- 
tane circle (Fig. 12.4), replacing each other on different mountains (kikuyensis along 
the Albertine Rift into Kenya, nigriceps in southern Kenya/northern Tanzania, usam- 
barae in Pare and Usambara mountains of the northern Eastern Arc, neumanni in 
Usambara mountains, chlorigula along the southern Eastern Arc mountains, fusciceps 
along the Malawi Rift). Andropadus masukuensis was once treated as a subspecies of 
A. montanus (White, 1962; separation suggested by Stuart, 1986; Dowsett-Lemaire, 
1989), which is sympatric with A. t. tephrolaemus and bamendae in the Cameroon 
highlands. Interestingly, however, Keith et al. (1992) suggest that A. masukuensis is in 
fact more closely related to A. t. tephrolaemus. Andropadus milanjensis has a "southerly"
distribution extending from the Malawi Mountains of Mozambique and up through
Malawi to Tanzania and the northern Eastern Arc Mountains (Fig. 12.4). Southern 
populations resemble A. tephrolaemus whereas those of the north are more distinct. 

In this analysis we compared 597 bp of the cyt b gene sequence of the mtDNA 
from individuals of each species (Fig. 12.5). Both 5' (297 bp) and 3' (300 bp) ends 
of the gene were amplified and sequenced in both directions using universal primers
(Kocher et al., 1989; Edwards et al., 1991) (see caption to Fig. 12.5). Sequences were 
aligned by eye, and no insertions or deletions were found, as expected from a coding 
region. Aligned sequences were then analyzed using maximum parsimony (PAUP; 
Swofford, 1991). The samples included six subspecies of A. tephrolaemus (chlorigula 
and neumanni from the southern Eastern Arc, usambarae and nigriceps from the north- 
ern Eastern Arc and Kenya Highlands, kikuyensis from the northern Albertine Rift, 
and tephrolaemus from Cameroon), six samples of A. masukuensis roehli from com- 
parable regions of the entire Eastern Arc, two A. milanjensis striifacies from different 
regions within the Eastern Arc, and Phyllastrephus flavostriatus as an outgroup. Our 
objectives were to investigate the evolutionary histories within and between these 
species, specifically to assess (1) the degree of genetic divergence and phylogenetic 
relationships within and between A. masukuensis and A. tephrolaemus, (2) dispersal 
and vicariance patterns between Albertine Rift and Eastern Arc Mountains, and 
(3) the potential interchange between the Cameroon Mountains and the “montane 
circle” of eastern Africa. 

Phylogenetic analyses of these populations suggest complex historical inter- 
changes between different montane areas, including an early vicariance and diver- 
gence of A. milanjensis (Fig. 12.5). However, the most interesting aspect is that A. 
tephrolaemus is paraphyletic in relation to the largely sympatric A. masukuensis. The 
genetic divergence among Tanzanian forms of A. tephrolaemus (N. Arc and S. Arc 
in Table I) is much larger than between A. masukuensis from the same geographical 
area (Table I), corresponding well with the difference in morphological divergence, 
and suggesting long isolation of different lineages of A. tephrolaemus. Andropadus 
masukuensis shows little genetic variation between populations separated by consid- 
erable geographical distances, indicating that its differentiation may have occurred 
only recently. 
FIGURE 12.5 Phylogeny of Andropadus species based on 597 bp of mitochondrial cytochrome b.
Species and regional distribution are indicated. One of two equally parsimonious trees is shown; the 
other tree differed only in the branching order of A. masukuensis from the southern Eastern Arc. Se- 
quence data were analyzed by the exhaustive search option of PAUP 3.1.1 (Swofford, 1991), giving all 
positions equal weight. A mean distance measure is indicated as is percentage bootstrap support for nodes 
from 1000 replicates. The tree has a consistency index of 0.606 excluding uninformative characters and 
a tree length of 387 steps. PCR and sequencing were performed using mtDNA primers 14841 with 
15149 (Kocher et al., 1989), and 15564 with 15915 (Edwards et al., 1991). 


Andropadus tephrolaemus is divided into three geographical groups consisting of
(1) the southern Eastern Arc, (2) the northern Eastern Arc/Kenya/northern Albertine
Rift, and (3) the Cameroon Mountains. The nodes between these branches are
all well supported and it is surprising that A. t. tephrolaemus is not an immediate
sister-taxon to the geographically closest A. t. kikuyensis (see Fig. 12.4). The geographic
distance between the nearest points of the Udzungwa Scarp (chlorigula) and
Uluguru Mountains (neumanni) is 75 km and the genetic distance is 0.076. The
distance between ranges inhabited by northern and southern groups (Mount Kanga
with chlorigula to East Usambaras with usambarae) is 135 km and yet the average
genetic distance between the southern and northern groups is 0.112 (Table I). This
can be compared to a geographic distance of 950 km and genetic distance of 0.069
between northern Albertine Rift and northern Eastern Arc (Table I). This result
may indicate that dispersal of an A. tephrolaemus ancestor to the northern and southern
Eastern Arc may have been from different directions around the “montane
circle.”


TABLE I Genetic Distances Averaged over Samples, Corrected for Missing Data

                                         Andropadus tephrolaemus
                           Andropadus
                           masukuensis   N. Arc     N. Albert   S. Arc     Cam
Andropadus masukuensis     0.011^a
Andropadus tephrolaemus
    N. Arc                 0.132         0.044^a
    N. Albert              0.116         0.069      0
    S. Arc                 0.13          0.109      0.098       0.076^a
    Cam                    0.082         0.124      0.108       0.119      0

^a Average intrapopulation distances are also given where appropriate (from PAUP 3.1.1; Swofford,
1991).

Abbreviations: N. Arc, northern Eastern Arc; S. Arc, southern Eastern Arc; N. Albert, northern
Albertine Rift; Cam, Cameroon Mountains.
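
The comparison of geographic and genetic distances drawn above can be restated as simple arithmetic. The Python below is only a sketch: it uses the three distance pairs quoted in the text, and the "genetic distance per 100 km" index and all names are introduced here for illustration rather than taken from the original analysis.

```python
# (comparison, geographic distance in km, genetic distance), as quoted in the text
comparisons = [
    ("Udzungwa Scarp vs. Uluguru Mountains", 75, 0.076),
    ("southern vs. northern Eastern Arc", 135, 0.112),
    ("northern Albertine Rift vs. northern Eastern Arc", 950, 0.069),
]


def distance_per_100_km(km, genetic_distance):
    """Genetic distance accumulated per 100 km of geographic separation."""
    return genetic_distance / km * 100


values = [distance_per_100_km(km, d) for _, km, d in comparisons]
for (label, _, _), value in zip(comparisons, values):
    print(f"{label}: {value:.4f} per 100 km")
```

On this crude index the two Eastern Arc comparisons exceed the Albertine Rift comparison by more than an order of magnitude, which is the mismatch between geography and genetics that the text interprets as evidence of separate colonization routes.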

During moist parts of interglacial periods, forest formed intermittently along the 
“northern dispersal corridor,” permitting expansion through the Kenya Highlands 
to the humid Usambara Mountains (see Hamilton, 1982). The Udzungwa and Ulu- 
guru Mountains are assumed to have been permanently affected by humidity from 
the Indian Ocean, permitting persistence of isolated A. t. chlorigula and A. t. neu- 
manni populations here (see Fig. 12.3). 

Interestingly, in contrast to A. tephrolaemus, A. masukuensis displays little genetic 
variation between populations, over a large geographical range, indicating that al- 
though closely related, these two species have had different evolutionary histories. 
The resulting phylogenetic reconstruction indicates periods of vicariance through 
isolation accompanied by intermittent dispersal. Such a scenario could explain the 
origin and distribution of A. masukuensis, which appears to have originated from an 
A. tephrolaemus-like ancestor, perhaps after dispersal and isolation on Mount Cam- 
eroon. Subsequent reinvasion of the Albertine Rift Mountains then allowed it to 
confront its "parental species" as an independent species. The close association be- 
tween A. t. tephrolaemus and A. masukuensis revealed by this analysis had been sus- 
pected by Keith et al. (1992). Interestingly, A. masukuensis shows a south to north 
cline (although weakly supported by the phylogenetic reconstruction; Fig. 12.5), 
indicating that invasion of the Eastern Arc from the Albertine Rift could have been
from a "southern dispersal corridor" (Fig. 12.4). 


VII. DISCUSSION AND SUMMARY 


The case studies described above illustrate some aspects of the biogeographic dy- 
namics that must be considered in order to understand the processes involved in the 
evolution of the rich avifauna of tropical forests. While Leptopogon shows a simple 
pattern of differentiation that correlates with the uplift of the Andes, Scytalopus
illustrates a more complex case in which ancient differentiation caused by geological
factors has been covered by new cycles of dispersal and vicariance that may
be tentatively associated with Quaternary climatic-vegetational cycles. These two
cases are possibly extremes of a continuum of patterns that will eventually include 
patterns of phylogenetic relationships showing a series of multiple biotic inter- 
changes between mountain and lowland avifaunas. With regard to Africa, relation- 
ships within Andropadus show that a recent dynamic process of vicariance-induced 
speciation followed by dispersal has occurred within the mountains of East Africa. 
In addition, a complex interchange between East Africa and the Cameroon Moun- 
tains has also been revealed. 

It is evident that a great deal of avian diversity has been generated in recent times 
within tropical montane regions and that further studies are needed to understand 
its magnitude. In particular, the prediction in the model hypothesis that lowland 
rainforest biota are recruited from radiations in montane regions needs to be fully 
tested using phylogenetic studies. Nothing can be concluded from species that now 
live only in lowlands, but a well-resolved population phylogeny of species that ex- 
tend across Africa in lowlands as well as montane regions (A. virens, curvirostris, and 
gracilirostris) could provide more conclusive evidence. Leptopogon speciation could 
be interpreted in two ways, and a comprehensive study of population structure is 
needed to evaluate the possible interpretation that old species move into the low- 
lands as new species arise by vicariance in the mountains. A preliminary phylogenetic
study of spinetails (genus Cranioleuca) in our laboratory indicates that one
species inhabiting lowland Amazonia comes out within the Andean species group, 
suggesting an interchange between these two regions (J. Garcia-Moreno, unpub- 
lished data). 

Several theories have been proposed to explain the geographic patterns of species 
richness and endemism in tropical forest biomes. None has received more atten- 
tion than the refuge theory, which has become the predominant model for explain- 
ing the origin and biogeography of tropical forest organisms for the last three de- 
cades (Prance, 1982; Simpson and Haffer, 1978; Whitmore and Prance, 1987). We 
showed, using a simple test that combines DNA-DNA hybridization with
distribution data, that the refuge theory, in its original form (Haffer, 1969, 1974), cannot
account for the most recent bursts of speciation of South American and African 
lowland avifaunas. In fact, molecular studies in lowland groups of birds (Capparella,
1988; Gerwin and Zink, 1989; Gill and Gerwin, 1989; Hackett, 1993; Hackett and 
Rosenberg, 1990), mammals (Patton et al., 1994), and frogs (Heyer and Maxson, 
1982) indicate high genetic divergence between pairs of closely related species. On 
the basis of different molecular clock calibrations, these levels of divergence among 
species suggest that most of the speciation in tropical lowland biotas occurred before 
the Quaternary. 
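
The reasoning from divergence levels to pre-Quaternary ages can be illustrated with a short sketch. The Python below assumes a strict molecular clock; the rate of 2% pairwise sequence divergence per million years and the 1.8 MY Quaternary boundary are assumptions chosen here for illustration (the studies cited above used their own calibrations), and the two input distances are hypothetical.

```python
ASSUMED_RATE_PER_MY = 0.02          # assumption: pairwise divergence per million years
ASSUMED_QUATERNARY_START_MY = 1.8   # assumption: boundary age in million years


def divergence_time_my(pairwise_distance, rate_per_my=ASSUMED_RATE_PER_MY):
    """Estimated time since divergence (MY) under a strict molecular clock."""
    if rate_per_my <= 0:
        raise ValueError("rate must be positive")
    return pairwise_distance / rate_per_my


def predates_quaternary(pairwise_distance,
                        rate_per_my=ASSUMED_RATE_PER_MY,
                        boundary_my=ASSUMED_QUATERNARY_START_MY):
    """True if the estimated split is older than the assumed boundary."""
    return divergence_time_my(pairwise_distance, rate_per_my) > boundary_my


# Hypothetical distances: a deeply diverged pair and a shallow one.
for d in (0.08, 0.01):
    print(d, divergence_time_my(d), predates_quaternary(d))
```

Because the estimate is a simple ratio, any uncertainty in the calibration carries over proportionally to the inferred date, which is why the text refers to several different clock calibrations.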

This analysis provides evidence for bursts of speciation in montane regions during
the Quaternary climatic-vegetational fluctuations and does not support the hypothesis
that avian diversification was intensive in the lowland regions during this
period. We suggest that since montane regions are highly heterogeneous with re- 
gard to vegetation, climate, and topography, there is a good chance that areas of 
paleoecological stability may exist as small pockets within them. These stable areas 
would be consistent in terms of climate and vegetational cover throughout periods 
of shifting global climate and would, in a sense, act as small refuges (sensu Brown 
and Ab'Saber, 1979; Vrba, 1992). It is not necessary to invoke barriers of open- 
vegetation habitats to explain the isolation of populations of forest birds in tropical 
mountains. From a metapopulation perspective, range disjunction can also arise as 
a consequence of ecoclimatic instability affecting community composition (Gilpin 
and Hanski, 1991). Because the assumed stable areas are small, the populations of 
animals and plants that would live in them would themselves be of small size. This 
could lead to rapid divergence from parent populations owing to rapid fixation of 
alleles and founder effect (Avise, 1994). 
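
The claim that small populations fix alleles rapidly can be explored with a toy simulation. The Python below is a sketch under a simple neutral Wright-Fisher model, which is an assumption introduced here and not an analysis from this chapter; the function names, population sizes, replicate count, and random seed are all arbitrary illustrative choices.

```python
import random


def generations_until_fixed(n_copies, start_freq=0.5, rng=None):
    """Generations until a neutral allele is fixed or lost among n_copies genes."""
    rng = rng or random.Random()
    count = round(n_copies * start_freq)
    generations = 0
    while 0 < count < n_copies:
        p = count / n_copies
        # Each gene copy in the next generation is drawn from the current pool.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        generations += 1
    return generations


def mean_time(n_copies, replicates=50, seed=1):
    """Average time to fixation or loss over independent replicate populations."""
    rng = random.Random(seed)
    total = sum(generations_until_fixed(n_copies, rng=rng) for _ in range(replicates))
    return total / replicates


small = mean_time(20)    # a small isolate
large = mean_time(200)   # a population ten times larger
print(small, large)
```

In this model the expected time for drift to remove variation scales roughly with population size, so isolates in small stable pockets would be expected to diverge from their parent populations much sooner than large lowland populations.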

In our case studies, we have illustrated that forest montane avifaunas may have 
been assembled by a combination of factors. These range from local to regional ecological
changes caused by geological events (e.g., tectonism) to continental to global
ecological changes caused by Croll-Milankovitch climatic cycles. In some groups
of birds (e.g., Leptopogon), ancient processes of speciation caused mostly by geo- 
logical changes are still conspicuous and recoverable. In other groups (Scytalopus and 
Andropadus), ancient processes of speciation have been hidden by new cycles of 
vicariance and dispersal associated with Quaternary climatic-vegetational cycles. In
these groups, ancient processes of speciation are not recoverable, but the most re- 
cent ones are. 

Molecular studies are a valuable part of any modern biogeographic analysis since 
they open the possibility for the development of rigorous protocols that can be used 
to untangle the patterns of species distributions from the processes that caused them. 
These protocols can be applied to a wide range of biotas, including complex ones 
such as tropical montane avifaunas. 


I. INTRODUCTION


During the first decade of ancient DNA analysis the concept of analyzing prehistoric
genetic information has emerged from obscurity to become a relatively well-known
and mainstream pursuit (Higuchi et al., 1984; Crichton, 1991; Pääbo, 1993).
Ancient DNA techniques are particularly suited to the study of avian evolution 
since birds make up a disproportionate number of the world's recently extinct and 
currently threatened taxa. Extensive museum collections of avian skins and skele- 
tons, often significant contributors to this situation, are now a considerable resource 
for systematics research. Many extinct avian taxa were the results of evolution 
within ecosystems that have since disappeared, and represent unique unrepeatable 
experiments. In these situations ancient DNA techniques may allow “lost” genetic 
information to be used to reconstruct these evolutionary paths and assist in the 
conservation of remaining biota. 

The study of preserved macromolecules has provided access to prehistoric genetic
information and allowed molecular evolutionary change to be examined in
real time, rather than through extrapolation from DNA of living organisms. Ancient 
DNA has been used in subjects as diverse as systematics, population genetics, paleo- 
ecology, archaeology, conservation biology, and forensics (Pääbo, 1993) creating 
new importance for museum specimens, and changing the role of museums them- 
selves (Houde and Braun, 1988). 

Since its discovery, the polymerase chain reaction (PCR) (Saiki et al., 1988) has 
formed the basis of nearly all ancient DNA research because it permits a designated 
sequence to be amplified from a small number of damaged templates amidst a back- 
ground of nonspecific DNA (Pääbo and Wilson, 1988). From 1985 through to the 
mid-1990s the template of choice for most vertebrate ancient DNA studies has been 
mitochondrial DNA (mtDNA), primarily because of the high number of mito- 
chondrial genomes per cell and the prevalence of its use in other systematic and 
population studies. Because ancient DNA is invariably damaged, studies have gen- 
erally been restricted to sequences of less than 500 base pairs (bp), but until recently 
this has been sufficient for phylogenetic comparisons with extant taxa. 

Nuclear DNA sequences are a powerful complement to mtDNA data because 
they provide an independently inherited set of molecular characters. Rapidly evolv- 
ing nuclear microsatellites are particularly useful for studies of population genetics, 
and avian microsatellites have already been successfully amplified from museum 
specimens (Ellegren, 1991, 1993; Roy et al., 1994). Unfortunately, the compara- 
tively low ratio of single-copy nuclear to mitochondrial genes in most cells means 
that the range of ancient samples likely to yield single-copy nuclear sequences is 
far less than that for mtDNA. Nevertheless, the amount of information accessible 
through nuclear sequence data will ensure that both genomes feature in future an- 
cient DNA studies. 

Ancient DNA research has obvious utility in determining the systematic rela- 
tionships of extinct taxa, but may also clarify the evolutionary relationships of extant 
relatives by balancing taxon sampling across parts of the tree (Cooper et al., 1992; 
Höss et al., 1995). However, more substantial evolutionary questions can be ad- 
dressed if the data are combined with information from other fields such as pale- 
ontology and ecology. 


II. REVIEW OF ANCIENT DNA RESEARCH


The majority of ancient DNA publications have concerned the systematic position 
of extinct taxa (Thomas et al., 1989; Cooper et al., 1992; Höss et al., 1995; Christidis
et al., 1996; Houde et al., Chapter 5 in this volume), although studies involving 
taxon identification (Hagelberg et al., 1989, 1991; Höss et al., 1992; Sallares et al., 
1995; Cooper et al., 1996) and population genetics (Thomas et al., 1990; Wayne 
and Jenks, 1991; Stone and Stoneking, 1993; Roy et al., 1994) have become more 
prevalent. Although there has been a surge in the number of evolutionary questions 
addressed using preserved macromolecules, appreciation of the practical difficulties
inherent in ancient DNA research has lagged behind. 

Mammoths, mummies, and dinosaurs are the stuff of legends, and it is the re- 
sponsibility of all those involved in ancient DNA research that scientific publications 
do not also fall into this category. The majority of published ancient DNA studies 
involve late Pleistocene or Holocene specimens (50,000 years old to present) and 
therefore do not conflict with observed DNA decay rates (Lindahl, 1993a,b). In 
contrast, reports of DNA preservation over many millions of years (Golenberg et al., 
1990; Soltis et al., 1992; Cano et al., 1993; Poinar et al., 1993; Woodward et al., 
1994) have attracted considerable publicity and scrutiny, yet are still to be indepen- 
dently verified (Sidow et al., 1991). Even if such studies are legitimate, they are 
valueless without independent replication. Perhaps the most important point these 
studies demonstrate is the central role of authentication procedures in ancient DNA 
research (Pääbo et al., 1989; Lindahl, 1993b; Handt et al., 1994a, 1996).

Difficulties in demonstrating the authenticity of ancient DNA sequences vary 
considerably, and in some cases only circumstantial supporting evidence exists 
(Handt et al., 1994a). Cryptic contamination is a significant problem when contaminating
DNA, often from within the laboratory, is similar or identical to real
ancient sequences. Systematic studies of ancient DNA are somewhat less susceptible 
to this problem than population studies, because sequences can be phylogenetically 
contrasted with extant taxa. However, cloning artifacts (Higuchi et al., 1987; Pääbo 
and Wilson, 1988), damaged modern contaminant DNA (Collura and Stewart, 
1995), and chimeric sequences (DeSalle et al., 1993; Hackett et al., 1995) can still 
confuse analysis. A further complication is nuclear copies of mtDNA genes, which 
may appear functional (Quinn and White, 1987; Arctander, 1995; Collura and 
Stewart, 1995; Zischler et al., 1995; Sorenson and Fleischer, 1996) and can act as 
default ancestral sequences. Nuclear pseudogenes will be more problematical if 
samples have a high ratio of nuclear DNA to mtDNA, such as nucleated avian 
erythrocytes. 

As the difficulties involved in preventing contamination have become appreciated, new
criteria have been adopted. Authentication techniques currently in use emphasize
the independent replication of results (Handt et al., 1994b; Taylor, 1996) and
reporting of failed attempts (Hänni et al., 1994). Recent developments include cloning
PCR products to examine individual sequences when ambiguities exist in direct 
sequences (Handt et al., 1996), and examining the sample for evidence of suitable 
preservation by measuring amino acid racemization (Poinar et al., 1996) or histo- 
logical preservation and nitrogen content (Hedges et al., 1995; Colson et al., 1997). 


A. Techniques 


Because a variety of source materials are used in ancient avian DNA studies, some 
of the commonly used extraction procedures are briefly reviewed below. A broader 
discussion of extraction and amplification techniques can be found in Ancient DNA
(Herrmann and Hummel, 1994). 

Preserved bone is often a better source of DNA than surrounding tissue (Cooper 
et al., 1992) and this is also apparent in the macroscopic preservation of many re- 
mains. Compact bone from weight-bearing limbs appears to be a reliable source of 
well-preserved DNA, whereas yields from cancellous (trabeculae or marrow) bone 
are poor and the risk of environmental contamination is increased. Desirable bones 
for sampling exhibit few external signs of diagenesis, such as cancellous bone show- 
ing through damaged epiphyses, cracked surfaces, and bleached or discolored sec- 
tions. To avoid ingrained human DNA from handling of the bone, the surface of 
the sample area should be mechanically removed to the practical maximum. How- 
ever, sweat and dust may penetrate deeply below the surface of samples and the 
potential contribution from this source of DNA should not be underestimated 
(Richards et al., 1995; Handt et al., 1996). 

The speed at which preserved specimens are dehydrated appears to be a signifi- 
cant factor controlling the size of amplifiable DNA fragments, and this is presum- 
ably related to the period during which endogenous endonucleases remain acti- 
vated (Pääbo, 1993). Accordingly, DNA from museum specimens that are prepared 
quickly may permit relatively long PCR amplifications («1000 bp), in contrast to 
naturally preserved mummies or bones, from which smaller amplifications («400 
bp) are normal. Tissue remains are often the most accessible in museum specimens, 
but bone samples are advisable when there is a need for long sequences or protec- 
tion from some external treatment (such as alum, varnish, arsenic, or shellac) that 
may be detrimental to enzyme activity. It is helpful that bones such as phalanges and 
sections of humerus are often left in prepared skins, but if they are not available then 
tissue samples of thick skin from the extremities of the specimen (such as toe pads) 
may do. As a general rule for museum specimens, 0.1—1.0 g of bone or 2-5 mm? 
of tissue is normally sufficient for analysis, although this is dependent on specimen 
preservation. Formalin-fixed samples vary in DNA content, and the processes in- 
volved are often complex (Grody, 1994). 

There are two broad categories of extraction technique presently in use with 
ancient specimens. The traditional technique involves digestion of the sample with 
a proteinase (after, or simultaneously with, decalcification of bone) followed by 
extraction of DNA using organic solvents (Hagelberg et al., 1991; Cooper et al., 
1992). Disadvantages of this technique include the number of procedural steps, 
and the possibility of copurifying inhibitors. The second category uses silica to bind 
DNA in the presence of chaotropic agents (Boom et al., 1990; Höss and Pääbo, 
1993). Silica-based techniques are relatively simple and remove inhibitors effi- 
ciently, although it is possible that they are less efficient at recovering DNA than 
organic extraction. It is also possible to "mix and match" the extraction and purification
steps from the different techniques.

The laboratory situation used for ancient DNA research is one of the most im- 
portant aspects of any study. DNA extraction and PCR setup should be conducted
in a location physically well separated from PCR products, and a site in a different
building is strongly recommended. Protective clothing (disposable paper coveralls, 
footwear, and breathing masks) is needed to prevent contamination of samples and 
reagents, especially for population studies, where cryptic contamination is always a 
concern. To fully comprehend these precautions it is necessary to appreciate that 
aerosol droplets from successful PCR reactions can contain enormous amounts of 
amplified DNA, perhaps up to 10,000 copies of a sequence. As a result of this, 
shoes/clothing or drafts can easily transmit amplified DNA fragments as dust be- 
tween work areas via laboratory floor surfaces and air conditioning. Consequently, 
a sensible precaution is to complete research on ancient specimens before working 
on modern relatives that might become cryptic contaminants. 


III. SYSTEMATICS AND
PALEOECOLOGICAL APPLICATIONS 


The following three projects demonstrate practical applications of the above- 
described techniques and range in time from the Cretaceous [145—65 million years 
ago (MYA)] to the Holocene (10,000 YA to present). Each study uses DNA se- 
quence information from modern and extinct taxa to augment evidence of tem- 
poral change from traditional fields such as paleontology, geology, ecology, and 
biology. 

The first project concerns the evolution of the ratite birds, and illustrates how 
ancient DNA can contribute to systematic studies. New sequence data are presented 
that support the conclusions of an earlier study (Cooper et al., 1992), and ancient 
DNA sequences are shown to be essential for the evaluation of alternative phylo- 
genetic hypotheses. 

The remaining two projects use ancient DNA to investigate Pacific paleoeco- 
systems that have since been drastically altered. The first concerns the effects of a 
paleoecological catastrophe on three endemic New Zealand avian taxa and illus- 
trates how ancient DNA can provide important information about geological events 
in the absence of a fossil record. The second involves an extinct Hawaiian duck 
population, and shows how ancient DNA can reveal information needed for cur- 
rent conservation attempts. 


A. Ratite Systematics 


The living ratite birds are the ostrich (Struthio), emu (Dromaius), cassowaries
(Casuarius), kiwis (Apteryx), and rheas (Rhea) of the southern continents. They are
linked by several morphological characters including the paleognathous palate and
rhamphothecal grooves, and share these with the flighted tinamous of South America,
which are commonly believed to be their closest living relatives. Despite the seem-
ingly inordinate amount of research that has followed their scientific recognition 
(reviewed in Sibley and Ahlquist, 1981; Houde, 1988) the phylogenetic relation- 
ships of the ratite birds are still not fully resolved. 

Most recent research supports ratite monophyly (Cracraft, 1974; Sibley and 
Ahlquist, 1981, 1990; Caspers et al., 1994; Cooper and Penny, 1997), although 
Houde and Olson (1981) and Houde (1986) have suggested that fossil paleognath- 
ous birds from the late Paleocene/Eocene of the Northern Hemisphere indicate 
flighted polyphyletic origins. The relationship of these fossil paleognaths to ratites
is uncertain and they could be ancestral tinamous, sister taxa to either tinamous or 
ratites, or unrelated (Houde, 1988). In addition, the Northern Hemisphere fossil 
taxa are younger than mid-Paleocene rhea fossils (Tambussi et al., 1994; Tambussi, 
1995), so it seems unlikely that they represent ancestral ratites unless the latter 
evolved more than once (Houde, 1988). 

Two of the largest avian species known were members of recently extinct ratite 
groups, namely the Madagascan elephant birds (Aepyornis maximus was approxi- 
mately 500 kg and 2.5 m tall) and the New Zealand moas (Dinornis giganteus could 
reach 3 m and weighed approximately 250 kg; Cooper et al., 1993). Ratite fossils of 
the recently extinct and still living taxa are found on all of the southern continents 
created by the break-up of the Cretaceous supercontinent Gondwana, including 
India (Olson, 1985). Cracraft (1974) used this geographical distribution, and mor- 
phological data, to suggest a vicariant biogeographic origin of the ratites, with a 
basal divergence of the New Zealand kiwi and moa (see Cracraft, Chapter 7 in this 
volume). Subsequent morphological, DNA-DNA hybridization, and mtDNA se- 
quence analyses (Sibley and Ahlquist, 1981, 1990; Bledsoe, 1988; Cooper et al., 
1992) corroborated the vicariant origin hypothesis, but converged on a strikingly 
different phylogeny. Instead, all three studies place the kiwi, emu, and cassowary in 
a derived clade, a situation supported by cytogenetic studies (De Boer, 1980). In 
fact, the three studies differ essentially only in the relative position of the rhea and 
ostrich near the base of the tree, an encouraging degree of concordance (Sheldon 
and Bledsoe, 1993). 

The positions of the moa and the kiwi are the central differences
between the conflicting phylogenetic hypotheses, because otherwise they differ 
only in the position of the root (see Cracraft, Chapter 7 in this volume). Because 
Sibley and Ahlquist (1981) could not analyze the extinct moa using DNA-DNA 
hybridization techniques (although see Houde et al., 1995), the phylogenetic posi- 
tion of the moa lineage had been determined only by morphology (Cracraft, 1974; 
Bledsoe, 1988). The advent of ancient DNA techniques allowed Cooper et al. 
(1992) to include moa mitochondrial sequences in a ratite molecular phylogeny. In
contrast to Cracraft (1974), the tree placed the moa as a basal lineage and the
kiwi as a derived taxon, suggesting that New Zealand was invaded twice by ratites.

Consequently, two areas of disagreement remain about the ratite phylogeny. 
First, the basal and monophyletic New Zealand ratite clade of Cracraft (1974) contrasts
with the derived position of the kiwi in the other three studies. Second, the
other studies disagree on whether the rhea, ostrich, or both are the extant basal 
lineage (Sibley and Ahlquist, 1981, 1990; Bledsoe, 1988; Cooper et al., 1992). To 
resolve these issues further, the 12S data set of Cooper et al. (1992) is reanalyzed 
using the computer packages PAUP* 4.0d 44—51 (Swofford, 1996) and Spectrum 
1.0.5 (Charleston, 1996) and new phylogenetic techniques. In addition, new data 
sets from the mitochondrial NADH subunit 6 (ND6)/transfer RNA-proline intra- 
genic region and the nuclear protooncogene c-mos are presented. 


1. New Phylogenetic Analyses of the 12S Data Set


The ratite 12S data set of Cooper et al. (1992) consisted of an approximately 390- 
bp region of domain III, one of the most conserved areas of the mitochondrial 
genome (Mindell and Honeycutt, 1990). Sequences from four (later expanded to 
five, Cooper, 1993) of the six moa genera were presented along with all of the 
extant ratites except for two of the three cassowary species. Phylogenetic analyses 
of the data set with parsimony, distance, and maximum-likelihood methods pro- 
duced the same highly supported tree (Cooper et al., 1992), in which the kiwi and 
moa were not each other's closest relatives as suggested by Cracraft (1974), but 
assumed phylogenetic positions similar to those described by Bledsoe (1988). A 
relative rate test of the data using the tinamous as an outgroup found no significant 
variation within the ratites (Steel et al., 1996). 

The sequence data of Cooper et al. (1992) were aligned with limited reference 
to a secondary structure model of the mitochondrial 12S gene because no appro- 
priate avian model existed. Such a model has become available (Hickson et al., 
1996) and a revised alignment of the data with additional sequences of the tataupa 
and spotted tinamous (Crypturellus tataupa and Nothura maculosa, respectively) is
presented in Fig. 13.1. The new alignment identifies 366 homologous positions,
of which 105 are variable and 75 are parsimony sites. Parsimony, distance, and
maximum-likelihood analyses of the revised ratite data set strongly support the
phylogeny of Cooper et al. (1992), as shown in Fig. 13.2. LogDet analysis (Lockhart
et al., 1994) produces the same phylogeny, demonstrating that the tree topology is 
independent of base composition and lineage evolution rates. 
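
The site counts quoted above (variable and parsimony sites) can be tallied directly from an alignment. The following is a minimal sketch, assuming the usual definition of a parsimony (informative) site as one with at least two character states that each occur in at least two taxa; the function name and the toy sequences are hypothetical and are not the data of Fig. 13.1.

```python
from collections import Counter


def classify_sites(alignment):
    """Return (n_variable, n_parsimony_informative) for an alignment.

    `alignment` maps taxon name -> aligned upper-case sequence.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")

    n_variable = 0
    n_informative = 0
    for column in zip(*alignment.values()):
        # Ignore gaps and ambiguity codes; keep only unambiguous bases.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            n_variable += 1
        # Informative: at least two states, each present in two or more taxa.
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            n_informative += 1
    return n_variable, n_informative


# Hypothetical toy alignment (illustration only, not real 12S data).
toy = {
    "kiwi": "ACGTAC",
    "emu":  "ACGTAT",
    "moa":  "ATGTGT",
    "rhea": "ATGCGC",
}
print(classify_sites(toy))
```

In the toy alignment, four columns vary but only three are informative, because one column differs in a single taxon and so cannot favor one tree over another under parsimony.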

FIGURE 13.1 Approximately 390 nucleotide bases from domain III of the mitochondrial 12S gene of eight living, and five extinct, ratites as well as three
tinamous and a chicken. The sequences correspond to positions 1754–2147 of the published chicken sequence (Desjardins and Morais, 1990), and the stem
numbers given above the sequences follow the secondary structure model of Hickson et al. (1996). The 366 homologous positions are underlined and dots
indicate a base identical to the brown kiwi sequence given above. Taxon names: Brown KIWI (Apteryx australis), L. spotted KIWI (A. haastii), Roa KIWI
(A. owenii), CASSOWARY (Casuarius casuarius), Emeus MOA (Emeus crassus), Pachyorn. MOA (Pachyornis elephantopus), Anomalop. MOA (Anomalopteryx
didiformis), Dinornis MOA (Dinornis novaezealandiae), Megalap. MOA (Megalapteryx didinus), Common RHEA (Rhea americana), Lesser RHEA (Rhea pennata),
E. Cr. TINAMOU (Eudromia elegans), Spot. TINAMOU (Nothura maculosa), Tat. TINAMOU (Crypturellus tataupa).

FIGURE 13.2 Phylogenetic tree of the ratite and tinamou 12S sequence data in Fig. 13.1, obtained
from parsimony, neighbor-joining (two parameter with 0.5 gamma distribution, and LogDet corrections),
and maximum-likelihood methods (Swofford et al., 1996) using PAUP* 4.0d 44–51 (Swofford,
1996). To break the long branch between the ratite and tinamou taxa, the two most divergent tinamou
taxa (Eudromia and Crypturellus) are used in rooted analyses. The branch lengths are calculated using
neighbor joining with the LogDet correction and are drawn proportional to evolutionary distance. The
proportion of invariant sites is estimated to be 0.4 from comparisons of more than 100 divergent avian
12S sequences (Cooper and Penny, 1997). Maximum-likelihood analyses were used to estimate
transversion-to-transition ratios of 6.67:1 and 4.67:1 for the ratite, and ratite plus tinamou, data sets,
respectively. Bootstrap values from 1000 unweighted parsimony heuristic replications (in boldface), and 1000
neighbor-joining (LogDet correction) replications (in italics) of the ratite data set are given under the
branches, whereas values for the same data set plus the two tinamous are given above the lines.
Unweighted parsimony analysis produces 2 shortest trees (differing only in the resolution of a trichotomy
among the moa taxa) of 123 steps, and the same 2 shortest topologies are obtained if transversions are
weighted between 2 and 25 times that of transitions.

Two approaches are used to investigate the discrepancies between the 12S phylogeny
and those of Cracraft (1974), Bledsoe (1988), and Sibley and Ahlquist
(1990). First, optimality criteria such as parsimony and maximum likelihood are
used with the 12S data set to determine how optimal (or suboptimal) the various
phylogenetic hypotheses are. The lengths of the most parsimonious trees, and
maximum-likelihood values of the various hypotheses, are shown in Table I for
two data sets. The alternative phylogenies are clearly not supported by either
criterion, with the phylogeny of Cracraft (1974) being significantly worse in
maximum-likelihood analyses of both data sets. Importantly, the ability of the optimality
criteria to distinguish between the various phylogenetic hypotheses is severely
compromised if the moa taxa are excluded from the analysis. Without the moa taxa the
only difference between the hypotheses becomes the position of the root (see Cracraft,
Chapter 7 in this volume), so all are equally likely in the unrooted data set
(Table I). When the tinamou outgroups are included, the position of the root provides
limited resolving power between the hypotheses, but none is significantly
worse. Consequently, the moa taxa are essential to evaluate the various hypotheses
and fully resolve the ratite phylogeny.
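
The likelihood comparisons above depend on a paired test of per-site log-likelihoods. The sketch below shows the general idea of a Kishino-Hasegawa style comparison using the normal approximation; it is an illustration under stated assumptions, not the PAUP* implementation, and the function name and per-site values are hypothetical.

```python
import math


def kishino_hasegawa(site_lnl_a, site_lnl_b):
    """Compare two topologies from per-site log-likelihoods.

    Returns (delta, z, p), where delta is the total ln L difference (A - B),
    z is delta divided by its estimated standard error, and p is a two-sided
    p-value from the standard normal distribution.
    """
    n = len(site_lnl_a)
    if n != len(site_lnl_b) or n < 2:
        raise ValueError("need two equal-length lists with at least two sites")

    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    delta = sum(diffs)
    mean = delta / n
    # Variance of the summed difference, estimated from site-to-site variation.
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    if variance == 0:
        raise ValueError("per-site differences are constant; test undefined")

    z = delta / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return delta, z, p


# Hypothetical per-site ln L values for two trees (illustration only).
tree_a = [-2.0, -1.5, -3.0, -2.5]
tree_b = [-2.5, -1.9, -3.6, -2.8]
delta, z, p = kishino_hasegawa(tree_a, tree_b)
print(round(delta, 3), round(z, 2), p < 0.05)
```

When the moa taxa are removed and the trees differ only in root position, the per-site likelihoods of the unrooted trees are identical, so the difference is zero and no test can separate the hypotheses, which is the situation reported in Table I.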

TABLE I Phylogenetic Analyses of the Alternative Hypotheses Using Optimality Criteria^a

                              Ratite data set                     Ratite data set plus two tinamous   Ratite data set minus moa taxa      Ratite data set plus two tinamous minus moa taxa
Ratite phylogenetic           -ln L       Parsimony   Parsimony   -ln L       Parsimony   Parsimony   -ln L       Parsimony   Parsimony   -ln L       Parsimony   Parsimony
hypothesis                    (Tv = 6.67) Tv = 1      Tv = 6.67   (Tv = 4.67) Tv = 1      Tv = 4.67   (Tv = 6.67) Tv = 1      Tv = 6.67   (Tv = 4.67) Tv = 1      Tv = 4.67
Cooper et al. (1992)          1159.1      123 (2)     219.4 (2)   1434.9      181 (3)     305.8 (3)   977.99      94          162         1246.9      150 (3)     252.8 (3)
Cracraft (1974)               1178.7 (3)  132 (5)     234.1 (5)   1463.0 (3)  192 (3)     320.5 (3)   977.99      94          162         1262.9      155         261.4
Bledsoe (1988)                1170.6      129 (5)     231.1 (3)   1443.4      186         307.1       977.99      94          162         1259.9      154         260.4
Sibley and Ahlquist (1990)^b  1170.6      129 (5)     231.1 (5)   1443.4      186         307.1       977.99      94          162         1259.9      154         260.4
Sibley and Ahlquist (1990)^c  1170.6      129 (2)     231.1 (2)   1444.2      187 (3)     308.1 (3)   977.99      94          162         1260.2      154         260.4

^a Parsimony tree lengths and maximum-likelihood values (-ln L) for alternative phylogenetic hypotheses using 12S sequences of the ratites, and ratites plus two
tinamous. If more than one topology is equally optimal the number is given in parentheses. Transversions (Tv) are either unweighted (Tv = 1) or weighted 6.67 or 4.67
times (see Fig. 13.2). Kishino-Hasegawa tests (Swofford et al., 1996) of the various trees show that the phylogeny of Cracraft (1974) is significantly worse than that of
Fig. 13.2 using either the ratite, or ratite plus tinamou, data sets (see entries in boldface, p < 0.05 and p « 0.01, respectively). In contrast, the phylogenies of Sibley and
Ahlquist (1990) and Bledsoe (1988) are not significantly worse for either data set. If the moa taxa are excluded from the ratite data set, the alternative hypotheses are all
equally likely, as the only difference between the trees is the position of the root. When the moa taxa are excluded from the ratite plus tinamou data set there is limited
resolution, but no significant differences between the hypotheses. This demonstrates that the moa taxa are essential to test the various hypotheses properly.

^b Figure 354 of Sibley and Ahlquist (1990); rhea and ostrich as a basal monophyletic clade.

^c Figure 326 of Sibley and Ahlquist (1990); ostrich as the basal divergence within ratites.

FIGURE 13.3 Spectral analysis of the 366-bp 12S ratite data set, with taxonomic groupings (splits)
ranked by the frequency of signals in the data set that support them (above axis, in black), minus those
that conflict (below axis, in gray). The frequency of a split is the sum of occurrences of that split in the
data divided by the total number of nucleotides (Lento et al., 1995). Because there are many possible
patterns in which the data could conflict with a given split, conflict values are normalized so that they
sum to the same value as the support signals. The taxonomic groupings corresponding to splits are listed
at top right, with clades in the phylogeny identified by the optimality criteria (Fig. 13.2) underlined.
There are 4096 possible splits in this data set, and the vast majority have little or no support, so only the
strongest 14 are shown. Phylogenetic groupings found by the optimality criteria (Fig. 13.2) have signals
with high support and low conflict values (splits 1–8), in contrast to the taxon groupings in splits 9
and above. The signals for taxon groupings from the alternative phylogenetic hypotheses are shown as
splits 16 and 17 (actually 4094 and 4083 by support values, respectively), and have no support and large
conflict values.

FIGURE 13.4 Spectral analysis of the ratite and tinamou data set with normalized conflict values.
Only the 15 strongest signals out of the possible 16,384 are given. While the addition of the distant
outgroup decreases the signal within the ingroups, and increases noise, the clades found by the optimality
criteria (underlined) are still well supported. Splits 7, 8, and 10 are not seen in Fig. 13.2 but have slightly
stronger signals than splits 9, 11, and 12, which were identified by the optimality criteria. The anomalous
taxon groupings are not biologically sensible and presumably occur because of the reduced level of
resolution. No support can be found for the alternative phylogenetic hypotheses in the data set (splits
17–20, representing rhea/ostrich monophyly, basal ostrich lineage, and New Zealand ratite monophyly,
respectively) and they are ranked between 16,289 and 16,366 by support values.

A more comprehensive approach to measuring suboptimal signals in a data set is
to avoid using any optimality criterion. Spectral analyses (Hendy and Penny, 1993;
Hendy et al., 1994; Lento et al., 1995) measure the direct support for every taxon
grouping (split) within a data set, and present these data independently of any
phylogenetic tree. Therefore, this method can measure the amount of support for, and
conflict against, any given phylogenetic arrangement and is particularly useful when
several alternative phylogenetic hypotheses are to be compared. Spectral analyses of
the ratite 12S sequences were performed using Spectrum 1.0.5 (Charleston, 1996)
on data sets without, and with, two tinamou outgroups (Figs. 13.3 and 13.4,
respectively). The resulting spectra identify a phylogeny identical to that presented in
Fig. 13.2 and indicate that the strongest signals (i.e., highest ratios of support to
conflict) are for clades identified by the optimality criteria. As the ratio of support
to conflict decreases, other, less likely taxon groupings appear. Because the
spectrum contains 2^(n-1) splits (where n is the number of taxa) and the vast majority
have low support-to-conflict ratios, only the 14 strongest signals are represented.
When the tinamou outgroups are included in the analysis (Fig. 13.4) the average
ratio of support to conflict decreases, because the long tinamou branch allows many
opportunities for convergent (homoplasious) substitutions with the ingroup taxa,
reducing resolution.
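
The split counts and the notion of support and conflict can be illustrated with a short sketch. It is a deliberate simplification and not the method implemented in Spectrum: only two-state alignment columns are counted, no Hadamard conjugation or conflict normalization is applied, and all names and sequences are hypothetical.

```python
def n_splits(n_taxa):
    """Number of splits (including the empty one) for n taxa: 2^(n-1)."""
    return 2 ** (n_taxa - 1)


def site_split(column, taxa):
    """Split implied by a two-state column, as the set of taxa whose base
    differs from that of the first taxon; None for other columns."""
    if len(set(column)) != 2:
        return None
    reference = column[0]
    return frozenset(t for t, base in zip(taxa, column) if base != reference)


def compatible(split_a, split_b, all_taxa):
    """Two splits can share a tree if one of the four intersections is empty."""
    rest_a, rest_b = all_taxa - split_a, all_taxa - split_b
    return (not (split_a & split_b) or not (split_a & rest_b)
            or not (rest_a & split_b) or not (rest_a & rest_b))


def spectrum(alignment):
    """Return (support, conflict) dictionaries keyed by split."""
    taxa = list(alignment)
    all_taxa = frozenset(taxa)
    n_sites = len(alignment[taxa[0]])
    support = {}
    for column in zip(*alignment.values()):
        split = site_split(column, taxa)
        if split is not None:
            support[split] = support.get(split, 0.0) + 1.0 / n_sites
    # Conflict: total support of every split that contradicts this one.
    conflict = {
        s: sum(v for t, v in support.items() if not compatible(s, t, all_taxa))
        for s in support
    }
    return support, conflict


# Hypothetical toy alignment (illustration only).
toy_alignment = {
    "kiwi": "ACGTAC",
    "emu":  "ACGTAT",
    "moa":  "ATGTGT",
    "rhea": "ATGCGC",
}
support, conflict = spectrum(toy_alignment)
for split in sorted(support, key=support.get, reverse=True):
    print(sorted(split), round(support[split], 3), round(conflict[split], 3))
print(n_splits(13), n_splits(15))
```

With 13 ratite taxa the formula gives 4096 splits, and with the two tinamous added it gives 16,384, matching the totals quoted for Figs. 13.3 and 13.4.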

Spectral analyses permit the direct measurement of the signals for kiwi/moa
monophyly (Cracraft, 1974), rhea/ostrich monophyly (Bledsoe, 1988; Sibley and
Ahlquist, 1990), and the arrangement in which ostrich is the sister taxon to other
ratites (Sibley and Ahlquist, 1990). As Figs. 13.3 and 13.4 show, these signals have
almost no support, and high conflict values. To investigate the amount of support
for these taxon groupings in the data set, all possible splits were ranked by support
values. Of the 4096 splits possible when only the ratite taxa are considered, the
signals for kiwi/moa and rhea/ostrich monophyly are ranked 4083 and 4094,
respectively (Fig. 13.3). When the tinamous are included, 16,384 splits are possible
and the monophyletic groupings, as well as that for the ostrich as a basal lineage, are
ranked between 16,289 and 16,354 (Fig. 13.4). Therefore, spectral analyses
demonstrate both that the phylogeny in Fig. 13.2 is well supported, and that there is no
direct support in the data for any of the alternative phylogenetic hypotheses.


2. New Ratite Molecular Datasets 


Ratite sequence data were obtained from the mitochondrial ND6/transfer RNA-proline
(ND6/tRNA-Pro) intragenic region and the nuclear c-mos protooncogene (c-mos),
using the same techniques and DNA samples described above. Polymerase
chain reaction primers and conditions are given in Cooper and Cooper (1995) and 
Cooper and Penny (1997). 

FIGURE 13.5 Sequences between the NADH subunit 6 (ND6) and transfer RNA proline (tRNA-Pro)
genes in the extant ratite genera, three moa taxa, and the chicken and wren outgroups. Mitochondrial
heavy-strand sequences (corresponding to positions 16,151–16,207 of the published chicken sequence;
Desjardins and Morais, 1990) are shown. The translated brown kiwi ND6 amino acid sequence is given
("Translation") with * as the termination codon. Taxon names are given in Fig. 13.1, except for rock
wren (Xenicus gilviventris).

FIGURE 13.6 Aligned sequences of a 660-bp region of the protooncogene c-mos for the extant ratites and a chicken. The chicken amino acid sequence
("Translation") is shown above the DNA sequences. The underlined 657 homologous positions are used in the phylogenetic analysis. There are 105 variable, and 20
parsimony, sites in the data. Taxon names are given in Fig. 13.1. The proportion of invariant sequence positions was estimated to be 0.4, on the basis of the number
of leucine and third codon positions, and variation observed in 10 avian orders (Cooper and Penny, 1997).

The ND6/tRNA-Pro intragenic regions of the extant ratite genera and three moas
are shown in Fig. 13.5. The basal state observed in the outgroup galliform and
passerine is an intragenic spacer sequence of 6–7 bp, typical of avian mitochondrial
intragenic regions (Desjardins and Morais, 1990). The intragenic regions of the
ostrich, moas, and rhea are similar, or slightly bigger, while the kiwi, emu, and
cassowary intragenic regions all have a large insertion, ranging from 20 to 30 bp.
The intragenic region may still be expanding in the kiwi species; the brown kiwi
(Apteryx australis) has a 3-bp insertion relative to the other two kiwi species. The
insert is strongly biased toward G and against C, with the 30-bp brown kiwi insertion
being 57% G and 0% C. The ratite intragenic regions correlate well with
the 12S phylogeny, since the derived ratites (kiwi, emu, and cassowary) all possess
a derived insert relative to the other ratites and outgroups. However, the nature
and distribution of this character state contrast strongly with a monophyletic kiwi/moa
clade.
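
The composition figures quoted for the insert are simple proportions: 57% G in a 30-bp insert corresponds to 17 G residues (17/30 = 0.567). A minimal sketch of the calculation follows; the function name is hypothetical and the sequence is a placeholder built only to match the stated proportions, not the real brown kiwi insert of Fig. 13.5.

```python
from collections import Counter


def base_composition(sequence):
    """Return the fraction of each of A, C, G, T among unambiguous bases."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    counts = Counter(bases)
    return {b: counts.get(b, 0) / len(bases) for b in "ACGT"}


# Placeholder 30-bp sequence with 17 G and no C (illustration only).
placeholder_insert = "G" * 17 + "A" * 7 + "T" * 6
composition = base_composition(placeholder_insert)
print({base: round(100 * frac) for base, frac in composition.items()})
```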

The c-mos protooncogene is a single-copy intronless nuclear gene that encodes
Mos, a serine/threonine kinase with important oocyte maturation-controlling 
functions (Sagata et al., 1988). Sequences of a 657-bp fragment of the ratite c-mos 
are given in Fig. 13.6. It was difficult to obtain sequences for the tinamou taxa and 
consequently the chicken sequence is used as an outgroup for the analysis. Unfor- 
tunately, it was also not possible to amplify a moa c-mos sequence using PCR, pre- 
sumably owing to the low concentration of any surviving single-copy c-mos se- 
quences. In contrast, relatively large amplifications (600—800 bp) were obtainable 
from museum specimens up to 20 years old (data not shown). The ratite nuclear c- 
mos sequences were used for phylogenetic analyses separately, and in combination 
with the mitochondrial 12S data. 

FIGURE 13.7 Unweighted LogDet corrected distance tree of the c-mos data (Fig. 13.6). The tree
topology is consistent with Fig. 13.2, suggesting the rhea is the basal ratite lineage. The long branch
joining the rhea to the ancestor of the remaining ratites suggests that a long period of time elapsed
between the initial split within ratites, and a subsequent radiation. The most parsimonious tree (not
shown) joins the ostrich, kiwi, and emu/cassowary lineages as an unresolved trichotomy. Bootstrap
values from 1000 heuristic replications using parsimony (above line) and LogDet corrected neighbor
joining (below line) are shown.

FIGURE 13.8 Unweighted LogDet corrected distance tree of the combined nuclear and mitochondrial
data sets (1023 bp). The same tree topology is produced by weighted and unweighted parsimony
and maximum-likelihood analyses, and is identical to Fig. 13.2. Bootstrap values from 1000 heuristic
replications using parsimony and LogDet corrected neighbor joining are given above and below
branches, respectively.

Phylogenetic analysis of the c-mos data (Fig. 13.7) reveals that the rhea is the
basal divergence within ratites, whereas the kiwi, ostrich, and emu/cassowary lineages
form an unresolved trichotomy. The topology is consistent with Fig. 13.2,
although there is less phylogenetic resolution, consistent with the slower evolutionary
rate of the nuclear gene and the use of a distant outgroup. The long branch
separating the rhea from the ancestor of the other ratites indicates that a considerable
period of time elapsed between the first divergence within ratites and a subsequent
radiation. It would be interesting to include the moa in this phylogeny to
determine whether the early separation of South America and New Zealand from
the other Gondwanic land masses might correspond to early rhea and moa divergences
relative to the other taxa.

A combined nuclear and mitochondrial data set of 1023 bp was created for
the rhea, ostrich, cassowary, emu, three kiwis, and chicken taxa for which both
12S and c-mos sequences had been obtained. Parsimony, distance, and maximum-likelihood
analyses of the data set (Fig. 13.8) identify the same phylogeny as in
Fig. 13.2, with similar levels of support. None of the alternative phylogenies is
statistically worse than this topology. This is not surprising because the moa taxa are
missing from the phylogeny and a distant outgroup is used, which severely reduces
resolution with respect to the 12S data set. Spectral analyses (data not shown)
demonstrate that of the 128 possible splits, a basal kiwi lineage is ranked 40 by support
values, while a basal ostrich lineage and rhea/ostrich monophyly are ranked 13 and
23, respectively. Consequently, even analyses with reduced levels of resolution
demonstrate that the phylogeny of Cracraft is poorly supported.
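
The combined data set is a concatenation of the 366 homologous 12S positions and the 657 c-mos positions (366 + 657 = 1023 bp), restricted to taxa sequenced for both genes. A minimal sketch of that bookkeeping, with a hypothetical function name and placeholder sequences in place of the real alignments, might look like this:

```python
def concatenate_alignments(*alignments):
    """Join alignments end to end, keeping only taxa present in all of them."""
    if not alignments:
        raise ValueError("at least one alignment is required")
    shared = set(alignments[0]).intersection(*alignments[1:])
    return {
        taxon: "".join(aln[taxon] for aln in alignments)
        for taxon in sorted(shared)
    }


# Placeholder sequences of the stated lengths (illustration only).
mt_12s = {"rhea": "A" * 366, "kiwi": "A" * 366, "moa": "A" * 366}
c_mos = {"rhea": "C" * 657, "kiwi": "C" * 657}  # no moa c-mos was amplified

combined = concatenate_alignments(mt_12s, c_mos)
print(sorted(combined), len(combined["rhea"]))
```

The moa drops out of the combined alignment because it lacks a c-mos sequence, which is why the combined analysis has less power to separate the hypotheses. With the eight taxa listed above, the number of possible splits is 2^(8-1) = 128.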


3. Why Do the Phylogenies Differ? 


Phylogenetic analyses of the mitochondrial and nuclear data sets strongly support 
the topology presented in Fig. 13.2, and indicate that not only are the phylogenies 
of Cracraft (1974), Sibley and Ahlquist (1981, 1990), and Bledsoe (1988) subopti- 
mal, but that there is absolutely no support in the data set for the taxon groupings 
in which they differ from Fig. 13.2. It is important to examine the sequence data 
for systematic biases that could obscure signals for alternative phylogenies, although 
in this case the signals would have to be removed, rather than just obscured. No 
significant evolutionary rate variation was observed in the data, and LogDet analyses 
indicate that the topology in Figs. 13.2, 13.7, and 13.8 is independent of base com- 
position. To detect biases due to long-branch attraction (Hendy and Penny, 1989), 
analyses were carried out in which ingroup taxa were excluded in turn from the 
data set to see if the tree topology changed. For example, if the emu and cassowary 
are left out of the 12S data set, the kiwi branch becomes longer than those of the 
rhea, moa, or ostrich but still the kiwi does not shift from its derived position in the 
phylogeny to move closer to the tinamou outgroup. No long-branch attractions 
were detected in the mitochondrial or combined mitochondrial and nuclear data 
sets, using this method. 
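
For readers unfamiliar with the LogDet correction, the sketch below computes a pairwise distance from the 4 x 4 matrix of joint base frequencies, using the commonly cited form d = -(1/4)[ln det F - (1/2) ln(det Px det Py)], where Px and Py are diagonal matrices of the base frequencies in each sequence. It is an illustration under that assumption, not the PAUP* implementation; the function names and sequences are hypothetical.

```python
import math

BASES = "ACGT"


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_distance(seq_x, seq_y):
    """LogDet distance between two aligned sequences (gaps are skipped)."""
    pairs = [(a, b) for a, b in zip(seq_x, seq_y) if a in BASES and b in BASES]
    if not pairs:
        raise ValueError("no comparable sites")
    # F[i][j]: proportion of sites with base i in x and base j in y.
    f = [[0.0] * 4 for _ in range(4)]
    for a, b in pairs:
        f[BASES.index(a)][BASES.index(b)] += 1.0 / len(pairs)
    det_f = determinant(f)
    if det_f <= 0:
        raise ValueError("LogDet distance undefined for this pair")
    freq_x = [sum(row) for row in f]
    freq_y = [sum(f[i][j] for i in range(4)) for j in range(4)]
    log_freqs = sum(math.log(p) for p in freq_x + freq_y)
    return -0.25 * (math.log(det_f) - 0.5 * log_freqs)


# Hypothetical sequences (illustration only).
seq_a = "ACGTACGTAACCGGTT"
seq_b = "ACGTACGTAACCGGTA"
print(round(logdet_distance(seq_a, seq_a), 6), round(logdet_distance(seq_a, seq_b), 4))
```

Because the base frequencies of both sequences enter the formula, the distance does not assume equal composition across lineages, which is the property the text relies on when arguing that the topology is independent of base composition.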

Because no shortcomings are apparent in the sequence data or analyses, it is nec- 
essary to examine the discrepancies between the alternative phylogenies and Figs. 
13.2 and 13.8. Sibley and Ahlquist (1981, 1990) and Bledsoe (1988) identified the 
kiwi, emu, and cassowary as a derived clade, but differ from Fig. 13.2 in how well 
they resolve the position of the rhea lineage. Sibley and Ahlquist (1981, 1990; see 
Cracraft, Chapter 7 in this volume) use the same data to obtain three different out- 
group combinations of the rhea and ostrich lineages, suggesting the data lack re- 
solving power among the deeper divergences. Two of the phylogenies are identical 
to Fig. 13.2 if basal branches are collapsed (e.g., owing to a lack of resolution), while
the conflicting phylogeny weakly places the ostrich as the basal lineage. Bledsoe 
(1988) tentatively placed the moa taxa as the basal lineage, but suggested this could 
be an artifact. Otherwise the phylogeny of Bledsoe is identical to Fig. 13.2, except 
that once again the long rhea and ostrich branches are joined, perhaps owing to a 
long-branch attraction. Consequently, the discrepancy between the DNA-DNA 
hybridization, morphological data, and Fig. 13.2 is easily explained as a difference 
in resolution of the basal rhea and ostrich lineages. The resolving power of the 12S 
data, provided by the moa taxa splitting the long rhea and ostrich branches, is a clear 
example of the value of ancient DNA sequences to systematic research. 

The phylogeny of Cracraft (1974) is widely divergent from those of all the other 
studies, including Fig. 13.2. Analysis of the 12S sequence data shows the phylogeny
to be significantly worse than that in Fig. 13.2, and none of the data sets provides any
support for a basal kiwi, or monophyletic New Zealand ratite clade. Furthermore, 
the ND6/tRNA"" insertion is difficult to reconcile with the topology of Cracraft. 
Subsequent publications (Sibley and Ahlquist, 1981; Bledsoe, 1988) have criticized 
Cracraft's character measurements, and the unusually small amount of homoplasy of 
the data set, especially given the convergence expected in a group of large flightless 
birds, many of which exploit similar habitats. Bledsoe (1988) used most of the mor-
phological characters measured by Cracraft and produced a widely divergent phy-
logeny that is similar to those of the genetic studies. The two different inter- 
pretations of what is essentially the same morphological data strongly suggest that 
subjective decisions about character states and polarity are influencing the mor- 
phological phylogenies. In contrast, the DNA-DNA hybridization and sequence 
studies utilize completely different, and objective, data to yield closely matched 
topologies. As further DNA studies support the same topology, the confidence in 
its accuracy should increase correspondingly. 


B. Paleoecological Studies 


Owing to the repeated destructive impacts of humans, the Pacific Islands are an
area to which ancient DNA research is particularly suited. Conservation manage-
ment is critical in attempts to halt the decline of biodiversity and ecosystem health 
in these environments. Unfortunately, many of these islands have insufficient pale- 
ontological information to form accurate views of the evolution and processes of 
past paleoecosystems. When bones and pollen are preserved in deposition sites, 
caves, or lava tubes, the picture that is produced is often far more complex than 
expected (James et al., 1987; Diamond, 1990; Olson and James, 1991; James and 
Olson, 1991; Cooper and Millener, 1993; Worthy and Holdaway, 1993; Steadman, 
1995; Cooper et al., 1996). In these situations, ancient DNA sequences can provide 
important information about extinct and endangered taxa, including phylogenetic 
relationships and changes in genetic diversity and gene flow between populations 
through time. The following two projects demonstrate how ancient DNA data can 
have quite different practical applications for surviving island endemics. 


1. The Oligocene Drowning of New Zealand 


New Zealand separated from the remnant Gondwana land mass around 80 MYA 
and carried a range of Gondwanic biota, augmented by wind-blown additions, to 
a position 1500 km from the nearest land mass (Cooper and Millener, 1993). Paleo- 
ecological “ghost” signals abound in New Zealand (Diamond, 1990) but the lack 
of a vertebrate terrestrial fossil record prior to the Pliocene (Fordyce, 1991) has 
constrained temporal aspects of evolutionary research. Ancient DNA studies are 
particularly useful in island situations such as this, in which a variety of morpho-
logically unique taxa lack paleontological records and therefore have poorly known
evolutionary histories. 

As discussed previously, New Zealand holds a central position in the evolution 
of several avian groups. In turn, the ecology of New Zealand has been shaped to a 
large degree by avian evolution. Because New Zealand lacked endemic terrestrial 
mammals, many typical “mammalian” niches were filled by birds, reptiles, and in- 
sects (Daugherty et al., 1993). The dominant herbivore in New Zealand was un- 
doubtedly the moa, as this giant nonruminant required large amounts of vegetation 
in the temperate climate. Moa browsing has been suggested to have exerted large 
selective effects on the growth patterns of New Zealand flora (Atkinson and Green- 
wood, 1989; Cooper et al., 1993). The extinction of the moa has even been hy- 
pothesized to have changed the forest ecology of New Zealand from one of gymno- 
sperm, to angiosperm, dominance (Wellman, 1994; Cooper, 1994). Consequently, 
the population history of the moa, and possibly also that of the kiwi, are important 
factors in evaluating the paleoecological interactions of New Zealand biota. 


a. Molecular Studies 


The ratite 12S sequences in Fig. 13.1 indicate that the five moa genera, which 
are morphologically quite diverse, possess a surprisingly limited amount of genetic 
diversity. Furthermore, the three kiwi species show a similarly limited amount of 
diversity. To contrast this pattern with that from a more rapidly evolving sequence, 
a 244-bp region of the ND6 gene was sequenced for the kiwi and moa taxa as well 
as three New Zealand acanthisittid wrens, one of which is extinct (Cooper and 
Cooper, 1995). Surprisingly, the maximum genetic diversity within ND6 sequences 
of each group (moa, kiwi, and wren) was similar (range, 0.254–0.377; Cooper and
Cooper, 1995), and repeated the pattern observed in the 12S sequences (range,
0.084–0.106). This situation is difficult to reconcile with hypotheses suggesting the
three groups arrived and radiated in New Zealand at different times, with the kiwi 
a recent, perhaps early Tertiary, dispersal from Australia (Sibley and Ahlquist, 1981; 
Cooper et al., 1992) while the moa and wren are ancient, possibly Gondwanic, 
lineages (Fleming, 1979). A further problem is that there are several deep splits 
within each group and the observed 12S and ND6 sequence diversity in the three 
groups is similar to that of avian taxa thought to have radiated as recently as mid- 
Oligocene to mid-Miocene times (Moum et al., 1994). Therefore the sequence data 
appear to indicate that each of these ecologically diverse bird groups radiated at a 
common, possibly Oligocene to Miocene, point in time. 
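
The comparison of maximum diversity within each group can be illustrated with a minimal sketch. It assumes aligned sequences of equal length and uses uncorrected p-distances; the distance measure behind the published ranges is not stated here, so this choice is an assumption, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    valid = set("ACGT")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def max_diversity(group):
    """Largest pairwise p-distance among the sequences of one group."""
    return max(p_distance(a, b) for a, b in combinations(group.values(), 2))


# Toy alignment (hypothetical data, not the published sequences).
toy_group = {
    "taxon_1": "ACGTACGTAC",
    "taxon_2": "ACGTACGTTC",
    "taxon_3": "ACGAACGTTC",
}
```

Applying `max_diversity` to each group in turn (moa, kiwi, wren) would give one value per group, which is the quantity compared in the text.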

Because these results are unexpected, it is important to reevaluate the authentic- 
ity of the ancient moa and wren sequences. The six moa taxa have different, but 
closely related, sequences and several individuals of most taxa have been sequenced 
(Cooper et al., 1992; Cooper and Cooper, 1995). Although no intraspecific 12S 
variation is seen, small amounts of ND6 sequence variation corresponding to geo-
graphic patterns are seen within species (data not shown). Moa and wren interspe-
cific variation occurs at positions known to be variable in 12S and ND6 sequences,
and the ratio of transversion to transition substitutions is consistent with mitochon- 
drial patterns. The sequences have been replicated in four physically separate labo- 
ratories, and also match the pattern observed in the living kiwis. Last, phylogenetic 
analyses cluster the moa sequences, and the wren sequences, in 2 clades when com- 
pared to sequences from 15 other avian orders (data not shown). 
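
One of the authenticity checks above, the balance of transitions and transversions, can be sketched as follows. The example assumes two aligned sequences and ignores gaps and ambiguous bases; the function name is hypothetical and no published sequences are used.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites and anything that is not an unambiguous base.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions
```

Among closely related mitochondrial sequences, transitions are generally expected to outnumber transversions, so a pair of counts that departs strongly from that pattern would prompt a closer look at a putative ancient sequence.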

Since the ancient sequences appear authentic, other explanations are needed. 
The similar, and limited, amounts of genetic diversity could result from indepen-
dent Oligo-Miocene arrivals of the three groups in New Zealand, presumably
from Australia, but this is unlikely for several reasons. The ratites are flightless (the 
moa has totally lost its wings) and the wrens are barely flighted, and no closely 
related Australian ancestral population exists for the moa or wren, while the kiwi 
appears to have diverged from the emu/cassowary lineage in the Eocene (Sibley and 
Ahlquist, 1981). This scenario would also require all three groups to
arrive independently, but almost simultaneously.


b. The Oligocene Marine Transgression 


Another explanation of the data is that while the moa, kiwi, and wren arrived 
and radiated independently in New Zealand, some relatively recent event reduced 
diversity in each group to a single mitochondrial lineage, from which there has been 
a subsequent radiation. This would conceal any previous diversity, and the groups 
would appear to have simultaneously radiated. Mitochondrial DNA is particularly 
sensitive to population size fluctuations and quickly loses diversity during periods 
of constant, or decreasing population size (Wilson et al., 1985). Consequently, the 
sequence data are compatible with a widespread ecological event that could simul- 
taneously constrain, or drastically reduce, the population size of the giant herbivo- 
rous moas, tiny insectivorous wrens, and nocturnal omnivorous kiwis. 
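
The loss of mitochondrial lineages in a small population of constant size can be illustrated with a toy simulation. This is a sketch under a simple neutral model with non-overlapping generations (a Wright-Fisher-style assumption that is not taken from the studies cited here); the function name, population sizes, and seeds are hypothetical.

```python
import random


def surviving_lineages(pop_size, generations, seed=0):
    """Count distinct founding maternal lineages left after neutral drift."""
    rng = random.Random(seed)
    # Each founder starts with its own unique mitochondrial lineage label.
    population = list(range(pop_size))
    for _ in range(generations):
        # Each offspring inherits the lineage of a randomly chosen mother.
        population = [rng.choice(population) for _ in range(pop_size)]
    return len(set(population))
```

Under these assumptions the number of surviving lineages can only stay the same or fall from one generation to the next, and a small population held at constant size for long enough is left with a single lineage, which is the pattern the bottleneck explanation relies on.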

The Tertiary geological record of New Zealand is briefly reviewed in Cooper 
and Cooper (1995) and reveals that a lack of tectonic activity in the Paleogene 
(65-23 MYA) reduced New Zealand to a broad lowland. During the large sea level 
changes of the late Oligocene (29-23 MYA), New Zealand was inundated by a 
marine transgression that is thought to have lasted some 6 million years, reducing it 
to a string of low-lying islands. The peak of this drowning event is estimated to have 
reduced the land area to about 18% of the current size (Cooper and Cooper, 1995). 
In the absence of a fossil record, the transgression has been hypothesized to have 
stimulated speciation (Stevens, 1985), although the exact effects of the large reduc-
tion in niche diversity and population sizes were unknown. The sequence data ap-
pear to remedy this situation, as they correlate perfectly with this geological catastro-
phe, both in timing and because mitochondrial diversity would be severely reduced
by a prolonged period (up to 6 million years) of limited population sizes and local 
extinctions. Furthermore, the subsequent star-like radiation of mtDNA diversity is 
consistent with a Miocene increase in land area and niche diversity. Consequently,
the combined genetic and geological data suggest a severe loss of endemic taxo-
nomic diversity in the mid-Tertiary, providing an important new view of a cata-
strophic period in the history of New Zealand (Cooper and Cooper, 1995).

Importantly, the Oligocene drowning hypothesis is testable because it predicts 
which habitats and taxa would have been most adversely affected during the marine 
transgression. Studies of nuclear sequence data in New Zealand endemics will en- 
hance the view of this event, but if it was as ecologically widespread as the mtDNA 
data suggest then conservation studies will need to accommodate the model when 
interpreting the genetic diversity of endemics. 


2. Conservation of the Endangered Laysan Duck 


Ancient DNA data have considerable potential to provide information about the 
evolutionary history of extant, as well as extinct, taxa. Genetic data from subfossils 
(preserved nonmineralized bone) can be used to analyze the prior range and habitat 
of taxa with recently restricted distributions. This is important in areas like the 
Pacific, where many flighted taxa currently endemic to islands are relics of formerly 
widespread populations (Steadman, 1995). 

Laysan Island is one of the many eroded islands that have formed as the Pacific
plate moves across the Hawaiian hotspot, and currently lies some 600 km to the
northwest of the main Hawaiian islands. It is only 370 ha in size with a maximum
altitude of 12 m, and is dominated by a large central hypersaline lagoon. The brine
flies that live on the lagoon are the main food source of the last remaining popula- 
tion of Laysan ducks (Anas laysanensis). This endangered population has varied in 
size from 500 to fewer than 20 individuals during this century and is highly vulnerable
to disease or climatic disruptions. Because the Laysan duck is historically known 
only from Laysan Island, it has been difficult to gain permission to establish a second 
population elsewhere, given the historically negative effects of introduced taxa in 
the Hawaiian islands. In addition, the Laysan duck and the Hawaiian duck, or koloa 
(Anas wyvilliana), an inhabitant of wetlands on most of the main Hawaiian islands, 
are thought to have evolved from stray migratory mallards (Anas platyrhynchos), 
which has greatly influenced recovery programs (Moulton and Weller, 1984). 

Paleontological studies have shown that fossil and subfossil bones of small duck 
species occur in late Pleistocene and Holocene deposits on the main Hawaiian is- 
lands (Olson and James, 1991; Giffin, 1993). Interestingly, the bones are found in 
association with a variety of paleontological habitats and indicate that the fossil spe- 
cies was widely adaptable. The bones are intermediate in size between the koloa 
and Laysan duck and do not provide sufficient information for identification pur- 
poses (Cooper et al., 1996), because morphology and body size are not particularly 
diagnostic in dabbling ducks (Worthy, 1988; Livezey, 1991). 

If the paleontological duck bones on the main Hawaiian islands are part of the 
former range for the Laysan duck, then considerable data about the paleoecology 
and evolution of the species might be gained. The information would be important
in potential relocation plans because it might identify whether the habitat of Laysan
Island was optimal for the species. Consequently, the identity of the bones was 
crucial, and in the absence of definitive morphological data it appeared that ancient 
DNA techniques might resolve the issue. 


a. Phylogenetic Analysis of Subfossil Ducks 


DNA was extracted from femurs and tibiotarsi of subfossils from lava tubes on 
the island of Hawaii. Two variable regions of the control region, spanning 312 and 
133 bp, respectively (positions 78–390 and 1117–1251 in the published chicken
sequence; Desjardins and Morais, 1990), were amplified and sequenced using PCR. 
Interestingly, DNA could not be amplified from subfossils found in low-altitude
sites (0–500 m), whereas subfossils from high-altitude sites (around 2300 m) worked well.
This correlation was also found in studies of other Hawaiian avian subfossils, even 
in high-altitude sites that were regularly wet, and is thought to relate to cold tem- 
perature. 

The mt control region sequences of three subfossil bones, three Laysan ducks, 
three koloas, two genetically diverse mallards, and an outgroup (African black duck, 
Anas sparsa) were aligned and 366 homologous positions identified. Of these, 60 
were variable and 36 were informative among the ingroup taxa (Fig. 13.9a). The
results of phylogenetic analyses of the data are shown in Fig. 13.9b. The sequences 
clearly demonstrate the subfossil taxa are closely related to the extant Laysan ducks, 
differing only by one transition. The long branch between the Laysan duck/sub- 
fossil clade and any other taxa indicates they are not genetically closely related to 
either mallard sequence. Conversely, the koloa taxa form a clade with mallard hap- 
lotype 2, indicating that the migratory mallard ancestry hypothesis may be correct, 
or that some degree of hybridization has taken place. Interestingly, mallard haplo- 
type 1 does not group strongly with the mallard 2/ koloa clade, indicating that con- 
siderable mtDNA genetic diversity exists in the mallard, as previously noted (Avise 
et al., 1992). 
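
The counts of variable and informative positions reported above can be reproduced for any alignment with a short routine. This is a minimal sketch that assumes "informative" means parsimony-informative (at least two states, each present in at least two taxa) and that ignores gaps, whereas the published tree treated gaps as characters; the function name and toy alignment are hypothetical.

```python
from collections import Counter


def classify_sites(alignment):
    """Return (variable, informative) column counts for aligned sequences."""
    variable = informative = 0
    for column in zip(*alignment):
        # Tally unambiguous bases only; gaps and ambiguity codes are ignored.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            variable += 1
            # Parsimony-informative: two or more states each seen in >= 2 taxa.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


# Toy alignment (hypothetical data): three variable columns, two informative.
toy_alignment = ["AACGT", "AACGA", "ATCGA", "ATCCT"]
```

A column in which only a single taxon differs is variable but not informative under this definition, which is why the informative count is the smaller of the two.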

The analysis shows that the Laysan Island population is a relict of a formerly 
widespread distribution, and provides justification for reestablishing populations of 
Laysan duck on the main Hawaiian islands. The ecological conditions that currently
exist on the main Hawaiian islands are obviously vastly altered from those of the
former population, and this must be taken into account when reintroducing a spe-
cies. However, the widespread distribution of paleontological remains indicates that
the species was surprisingly adaptable, inhabiting high-altitude (up to 1800 m)
forested sites far from water as well as sites near sea level (Cooper et al., 1996). 
Furthermore, the considerable numbers of seabirds that frequent Laysan Island have 
probably introduced, and stimulated immunity to, many avian diseases that have 
decimated other Hawaiian endemics. Whether sufficient genetic variability remains 
in the population after the many bottlenecks experienced on Laysan Island is 
currently unknown. Nevertheless, the establishment of any significant breeding
population outside Laysan Island must represent a considerable reduction of the
threat of immediate extinction.

FIGURE 13.9 (a) Variable positions from two portions of the mt control region of three Laysan ducks,
three subfossil bones, three koloas, two divergent mallard haplotypes, and an outgroup African black
duck (Anas sparsa) from Cooper et al. (1996). A total of 366 bp was obtained from the 5' and 3' variable
regions of the control region. The numbering system uses the 3' base of the light-strand primers as 0 for
each fragment. Taxon names are given in text. (b) Unweighted LogDet-corrected neighbor-joining tree
using gaps as characters with branch lengths drawn proportional to distance. The partial sequence from
subfossil bone 3 was excluded from analysis. Bootstrap values from 1000 replications are given for both
parsimony and neighbor joining, above and below branches, respectively. When subfossil bone 3 was
included in the analysis, it joined the Laysan duck/subfossil bone clade immediately prior to the diver-
gence of subfossil bone 2.


IV. SUMMARY 


The three projects described above demonstrate how genetic information from the 
past can be incorporated into modern evolutionary studies. In phylogenetic studies 
of the ratite birds, ancient DNA sequences from moa taxa drastically increased reso- 
lution among the basal lineages by splitting a long branch and improving taxon 
distribution across the tree. The moa sequences were also essential in demonstrating 
that several alternative phylogenetic hypotheses are not supported by the sequence 
data. Without the moa data the alternative hypotheses differed only in the position 
of a root, and a monophyletic moa and kiwi clade could not be tested. 

The ratite data led to a further discovery in New Zealand, where the combina- 
tion of geological data and ancient DNA sequences revealed the extent of a mid- 
Tertiary ecological disaster, previously concealed by a missing paleontological rec- 
ord. In both the New Zealand and Hawaiian studies, genetic data from recently 
exterminated taxa provided temporal information about paleoecosystems that have 
been severely disrupted. In so doing, the studies allowed the ecology of modern 
taxa to be reinterpreted in the light of paleoecological data, providing information 
for the conservation of surviving taxa. While both studies focused on avian taxa 
they serve as models for the investigation of many other island endemics or isolated 
populations. 


V. FUTURE RESEARCH 


The future of ancient DNA in systematic research is currently difficult to predict, 
as the increasing availability of automated sequencers means that studies involving 
many taxa can now realistically use sequences of thousands of base pairs. Whole 
mitochondrial genomes are now routinely used in vertebrate systematic studies 
(e.g., Horai et al., 1995; Xu et al., 1996) and sequences of this length hold consid- 
erable advantages for phylogenetic reconstruction (Charleston et al., 1994). As in- 
creasingly long DNA sequences become standard in systematic studies of extant taxa 
it will become correspondingly difficult to obtain ancient DNA sequences of a simi- 
lar size. The use of DNA repair systems to increase the length of ancient DNA 
amplifications (Lindahl, 1993a) may improve the situation slightly, but this problem
is only likely to grow. 

In contrast, ancient DNA studies that investigate the identity, or genetic diver- 
sity, of extinct or preserved taxa are likely to become more prevalent. The potential 
of ancient microsatellite data is still being explored, but this new area will undoubt- 
edly increase the scope of genetic studies of extinct populations in the next 10 years 
of ancient DNA research.

Accipitridae, 237–241
Adélie penguin, see Pygoscelis adeliae
Aepyornis maximus, 350 
African black duck, 366 
Agelaius phoeniceus, 268, 305, 315, 317—318 
Ailuroedus melanotus, 91 
Alca torda, 54
Alignment, 11, 220—222 

ambiguous, 180 

phylogenetic weighting, 129—130 
Alligator mississippiensis, 219 
Allozymes, 36, 43, 69, 302—303, 313 
ALLTOPS, 133 
Altitudinal replacement, 326, 334 
American robin, 301 
Amino acid substitutions and taxonomic diver-
Anas acuta, 54
A. laysanensis, 365
A. platyrhynchos, 54, 365
A. sparsa, 366
A. wyvilliana, 365
Andropadus, 335–338
Anhima cornuta, 177, 234—235 
Anhinga anhinga, 164 
Anser caerulescens, 18—20, 52, 54, 77 
Anseranas, 235 
Anseriformes, 180, 214, 225—228, 233-235 
Anthropoides, 122, 130, 134 
A. virgo, 127-128 
Apteryx, 174, 177, 350, 357, 363
A. australis, 54

Aptornis, 123, 128, 141, 142 
Aramus, 122, 127 
Ardeinae, 283 

Ardeotis, 122, 127 

Arenaria interpres, 54 
Aristotle, 163 
Aulacorhynchus derbianus, 91 
Auriparus flaviceps, 308 
Aythini, 19 

Aythya americana, 219, 226 


Balaeniceps rex, 162—164, 168 

Balearica, 95, 122, 127, 128 

Barbets, 91 

Barriers, 269, 304, 308, 309 

Base composition 
and transition-transversion ratios, 97 
and variation in mutation rate, 217 
bias in 12S rDNA, 224-225
bias in avian cytochrome b, 92-94
bias in avian MHC class II genes, 272-273
bias in birds, 11 
in control region, 57, 78 

Beringia, 309 

Biogeography, see also Phylogeography 
African, 330, 334—339 
Andes, 333—334 
and DNA hybridization data, 326 
colonization routes, 71—72 
dating events, 267, 269 
South American, 328–330
Birds of paradise, 90–91

Black-tailed gnatcatcher, 308 

Blue chaffinch, 78 

Bonasa, 234—235 

Bottlenecking, 69 

Bowerbirds, 91, 290 

Brachyrhamphus marmoratus, 309 

Brambling, 78 

Branch lengths, 102-104, see also Long branch 
attraction 

Branta canadensis, 54, 73—74, 315, 317, 318 

Bubulcus ibis, 283 

Burhinus, 140 

Bustards, see Ardeotis 

Buttonquail, see Turnix 


c-mos proto-oncogene, 357 
Caching, 289-291 
Cactus wren, 308 
Cairina moschata, 54 
Calidris, 268 
C. alpina, 38, 52, 66—69, 77 
C. canutus, 69-70 
California gnatcatcher, 309 
Campephilus, 92, 108 
Campylorhynchus brunneicapillus, 308 
Canada goose, see Branta canadensis 
Canis, 38 
Canyon towhee, 308 
Capito, 91, 100 
Carduelis chloris, 78 
Cariama, 122, 127, 128 
Carpenterian barrier, 74, 269 
Casuarius, 177 
Cathartidae, 167 
Cattle egret, 283 
Cepphus grylle, 54 
Chaffinches, see Fringilla 
Character 
correlation, 287—288 
displacement, 71—72 
loss, 169 
sampling, 218—219 
transformation, 287 
weighting, 12, 14, see also Weighting 
Character-state polarity, 180—181 
Charadriiformes, 123, 124 
Chauna chavaria, 234—235 
Chipping sparrow, see Spizella passerina 
Chiroxiphia linearis, 41–42


Chromatograms, 220 
Ciconia nigra, 164 
Cnemidophorus, 8 
Codon composition, 217 
Coevolution, 320 
COI gene, see Cytochrome oxidase I 
Colaptes, 92, 95, 100, 103, 104 
Columba inornata, 54 
Combining data sets, 218 
Communities 
‘hotspot’, 332 
species composition, 314 
stability, 320 
Comparative method, 279, 285—291 
Comparative phylogeography, 314—315 
Congruence 
character, 282 
DNA hybridization and mitochondrial gene 
sequences, 191 
taxonomic, 282 
Consensus approach, 282 
Conservation, 37—38, 325, 362 
Conserved sequence blocks, 58 
Constraints on molecular evolution, 215-218 
Contamination, 87, 347, see also Polymerase chain 
reaction, amplification of contaminants 
Control region, 51-78 
base composition, 57, 78 
domains, 58 
incomplete lineage sorting, 264 
in subfossils, 366 
interspecies variation, 61, 64 
intraspecies variation, 64 
O_H, 59
secondary structure, 8 
substitution dynamics, 259 
tandem repeats, 8, 61 
transcriptional promoter, 10, 58, 61 
Convergent evolution, 70 
Cooperative behavior, 41—42 
Cooperative breeding, 74-75, 292 
Corvidae, 290 
Coturnix, 235 
Cranes, 91—92, 107, see also Grus, Anthropoides, 
Balearica 
Cranioleuca, 339 
Crax mitu, 177 
Crocodylia, 6 
mt O,, 14 
pseudogene of tRNA’, 16 
sister to birds, 13-15 
Crowned crane, 147 


Crypturellus, 176, 225, 351

α-Crystallin A gene, 227

Cuculiformes, 230—231 

Cuculus canorus, 37 

Curve-billed thrasher, 308 

Cyanocitta cristata, 101 

Cytochrome b, 83-109, 176, 253, 255-260 
and biogeography, 334, 336 
compared to control region, 66, 73 
in reptiles, 15 

Cytochrome oxidase I, 92, 176, 219 
initiation codon, 11 
substitution rates, 225 

Cytochrome oxidase II, 176 


Data partitioning, 132—133, 218 

Dendrocygna, 234-235 

Deserts as barriers, 308 

Dinornis, 176, 350 

Dinornithids, 182—183 

Dinosaur, 15 

Diomedea immutabilis, 164 

Diphyllodes magnificus, 91 

Directional mutation pressure, 259 

Diseases, 366 

Dispersal, 311, 339 

Divergence time estimates, 223-224, 265 

Diversity index, 133 

D-loop, see Control region 

DNA extraction, 177, 220, 348 

DNA hybridization, 162, 174—175, see also 

AT H 

problems with, 125, 214—215, 327 
used to study speciation, 326-329 

DNA preservation, 347 

Dromaius novaehollandiae, 177 

Dryocopus pileatus, 92, 100, 103, 104 


Ecomorphology, 279 
Ecophylogenetics, 279 

Effective population size, N_e, 268–269
Elephant birds, 350 

Emeus crassus, 176 

Empidonax minimus, 90, 95, 101 
Endosymbiosis, 4 

Epimachus fastuosus, 91 

Error, systematic, 187 

Escherichia coli, 8
Eudromia elegans, 176, 177
European dunlin, see Calidris alpina 
Euryapteryx, 176 

Eurypyga, 122 

Extinction, 267 


Falco peregrinus, 219, 222, 226 
Falconiformes 

as outgroup, 128 

relationships among, 235-241 
Finfoot, see Podica 
Fingerprinting, 31 
Fixation, 217 
Flamingoes, see Phoenicopterus ruber 
Flickers, 92 
Fox sparrow, see Passerella iliaca 
Fregata magnificens, 164 
Fringilla, 54, 70—73, 78 
Fringillidae, 53 
F_ST, 40, 42, 75, 269, 311-313


Galliformes, 53, 180, 214, 225, 233-235 
Gallinula, 122 
Gallus, 95, 128, 130, 219, 226 
Gamma distances, 133, 134 
Gene 
conversion, 272 
deletion, 8-10 
duplication, 7-10 
linkage, 291 
rearrangement, 6-7 
Gene flow 
assessment using microsatellites, 37 
effect on gene trees, 263 
estimating, 310—311 
in chaffinches, 70 
in dunlins, 67 
in gray seals, 42 
in Pomatostomus, 75 
male-biased in geese, 74 
overcome by selection, 73 
preventing divergence, 269, 317 
Gene trees 
and subspecies, 69, 73, 106 
matriarchal, 74 
vs. species trees, 11, 84-85, 128, 262—267 
Genetic drift, 40—41, 70, 263, 287 
Genetic variance, see F_ST, G_ST
Genomic library, 32-33
Genotyping, 33-34 
Geographic genetic variation, 301—321, 338 


Gray-crowned babbler, see Pomatostomus temporalis 


Greenbuls, 335 

Greenfinch, 78 

Gruiformes, 121—158 

Grus, 92, 105, 122, 128, 130 
G_ST, 311–312


Hardy-Weinberg expectation, 40
Hawaiian duck, 365 
Heliornis, 122 
Hemipodes, see Turnix
β-Hemoglobin, 13
Herons, 123, 125, 139 
Heteroplasmy, 8, 53, 128 

in tuatara, 10, 15 

in Rallus, 147 
Heterozygosity, 69, 302—303 
Heterozygote 

advantage, 271 

deficit, 40 
Hirundo rustica, 293 
Histone, H2B, 13 
Historical ecology, 279—285 
Hoatzin, see Opisthocomus hoazin 
Homoplasy, 12, 150—152, 218 
Hybridization, 266, 267 
Hydrobatidae, 167 
Hydropathy profile analysis, 16 
Hypervariability, 53, 75 


Ibises, 123 

Inclusive fitness, 41 

Index of dispersion, 104—105 

Initiation codon, in COI, 11 

Intergenic spacers, 6, 8 

Internode lengths and phylogenetic resolution, 
102—104, 187, 214 

Interspecific divergence, d, 268 

Introns, 6 

Islands as barriers, 309 


Kagu, see Rhynochetos 
Key innovations, 288-291 


Kites, 240 
Kiwis, see Apteryx 
Knot, see Calidris canutus 


Larus, 54, 127, 292 

Laterallus, 122, 127 

Laysan duck, 365 

Le Conte’s thrasher, 308 

Leipoa ocellata, 177 

Lepidosauria, 14 

Leptopogon, 333 

Limpkin, see Aramus 

Lineage sorting, 85, 253, 262-267, 304 
and internode length, 128 
and subspecies, 70 

LogDet analysis, 351, 361 

Long branch attraction, 185, 190 
in tinamous, 189, 356 
problem, 14, 189, 219, 227, 229 
in ratites, 361 

Lybius bidentatus, 91 


Macroevolution, 251 
Magpie goose, 235 
Major histocompatibility complex (MHC), 267 
Mallards, 365 
Manakins, 41, 288 
Manucodia keraudrenii, 91 
Mapping genetic traits, 31 
Marbled murrelet, 309 
Mate choice, 295 
Mating systems, 36, 41 
Maximum likelihood, 133—134, 226-227, 258 
Megalapteryx didinus, 176 
Megapodidae, 292 
Megapodius freycinet, 177 
Melanerpes carolinus, 88 
Melanerpines, 92 
Meleagris, 234—235 
Melospiza melodia, 315, 317, 318 
Mesitornis, 122, 127 
Microevolution, 251—274 
Microsatellites, 29–43, 70, 264, 320
from museum specimens, 342 
in Pomatostomus, 54 
Minisatellites, 31 
Misidentification of specimens, 73, 128 
Mismatch distributions, 75, 318 


Mitochondrial DNA 
advantages, 5, 84 
gene flow, 310 
gene order in birds, 6 
genomes per cell, 5 
haplotypes, 20, 73, 303—305 
haplotype tree 
and geography, 303, 309—310, 315 
vs. nuclear-gene tree, 107 
insertions into nuclear genome, 17—22, 220 
Moas, 174, 191, 349, 350, 363 
Molecular and morphological data conflict, 169, 
175, 187, 191—192 
Molecular clock, 104, 107, 268, 317 
and refuge theory of speciation, 339 
for control region, 77 
Morphometrics, 70 
Morus bassana, 164 
Motacilla, 225 
Mountains 
and altitudinal replacement, 326 
barriers, 308 
folding, 328 
MtDNA, see Mitochondrial DNA 
Museum collections, 345 
Mutagens, 217 
Mutation, recurrent, 255 
Myoglobin, 13 


ND6/tRNA®®, 357 
Neognathae, 180, 214, 228—230 
Neornithes, 180 
Nest building, 293—294 
Neutral evolution, 67, 262 
New Zealand, 362 
marine transgression, 364 
Nothura maculosa, 351 
Nothocrax urumutum, 177 
Nothoprocta cinerescens, 176
N. perdicaria, 177
Nuclear copies of mitochondrial genes, 17—22, 
220 
in ancient DNA, 347 
use as an outgroup, 22, 241 
Null alleles, 40 
Numt, see Nuclear copies of mitochondrial genes 


Oceanodroma leucorhoa, 164
O_H, 59
Open reading frame, 6, 58

Opisthocomus hoazin, 219, 230—232 

Origin of mtDNA replication, 7—8, 10 
as a phylogenetic character, 14 

Orthology, 21—22 

Oscines, 86 

Osprey, 240 

Otus, 225 

Outgroup selection, 78, 219 

Ovis, 42 


Pachyornis elephantopus, 176 
Paleoecology, 362 
Paleognathae, 173—192, 214, 350 
Pandion, 240 
Panmixia, 70 
Paralogy, 21—22, 214 
Parapatry, 326 
Parasitic cowbirds, 320 
Parentage, 36 
Parsimony 
effects of taxon sampling on, 255—257 
preference for, 215 
Parus, 288—291 
Passerella iliaca, 308, 315, 317 
Passeriformes, 53, 228—230, 233 
Paternal leakage of mtDNA, 5 
PCR, see Polymerase chain reaction 
Pelecaniformes, 159—169 
Pelecanus erythrorhynchus, 164 
Phaethon lepturus, 164 
Phalacrocorax auritus, 164 
Phasianus, 234—235 
Phenotypic variation, 291 
Philopatry, 70 
Phimosus infuscatus, 127 
Phoenicopterus ruber, 219, 231, 233 
Phylogenetic constraints, 291-295 
Phylogenetic framework approach, 283 
Phylogenetic information from transversions, 101 
Phylogenetic signal, 255, 272 
Phylogeny 
accuracy, 281—283 
reconstruction 
amount of sequence required, 86, 103 
and weighting, 12, 130, 217-219 
recovery of "known" phylogenies, 218 
recovery of true phylogeny, 86—87 
value in interpreting evolution of behavior, 
280 
Phylogeography, 37, 66, 74, 302
Picoides, 92, 309 
Piculets, 92 
Piculus, 92, 103—104 
Picumnus aurifrons, 92 
Pileated woodpecker, see Dryocopus pileatus 
Pipilo fuscus, 308 
Pipridae, 288 
Plants, 266 
Pleiotropy, 291 
Plumage evolution, 70-73 
Podica, 122 
Pogoniulus bilineatus, 91 
Polioptila, 308, 309 
Polymerase chain reaction
  amplification of contaminants, 15, 87, 127, 349
  and ancient DNA, 15, 346
  “Jumping PCR,” 22
Polymorphism, intraspecific, 73—74, 253-258 
Pomatostomus, 10, 55, 74-75
  P. temporalis, 75-77, 253, 264, 269
Population structure, 36
  and degree of geographic isolation, 311-313
  assessing, 52, 303
  of dunlins, 66-67
  of Pomatostomus temporalis, 75-77
Primers, degenerate positions, 220 
Privative groups, 162 
Progne subis, 294 
Pseudogenes, 10, 16, 22 
Psophia, 122, 130 
Psophii, 123, 141 
Petrochelidon, 293, 294
Pterocles, 230 
Ptilonorhynchus violaceus, 89 
Ptiloris paradiseus, 91 
Puffinus tenuirostris, 164 
Purple martin, 294 
Pygoscelis adeliae, 54, 75—77 


Rallidae, 124 
Rallus, 122, 127 
Ramphastos tucanus, 91 
Range expansions, 318 
Rate of molecular evolution
  compared to non-molecular evolution, 192, 303
  effects on phylogenetic reconstruction, 217-219
  in control region, 66
  in mt 12S rDNA, 223-225
  of nonsynonymous substitutions, 272
  relative rate tests, 94-95, 144, 351
  variation in, 104, 133, 148, 151
Ratites, 173—192, 349—362 
12S rDNA, 121-153, 219-242
  compensatory substitutions, 145-147
  domain II, 149-150
  domain III in ratites, 351
  insertions and deletions, 147
  secondary structure, 125-126, 144-152, 221-222
  stem differences between mammals and birds, 145
  stem migration, 140
  substitution rates, 223
16S rDNA, 175, 227 
Recombination, 7 
Red-winged blackbird, see Agelaius phoeniceus 
Refuge theory of speciation, 67, 77, 332-333, 339
Relatedness, 41-42 
Replication, 217
  slippage, 8, 10, 34, 39, 147
Restriction fragment length polymorphisms, see Mitochondrial haplotypes
Rhea americana, 176, 177, 219, 226 
Rhynchotus, 176 
Rhynochetos, 122 
Roatelos, see Mesitornis 
Rooting
  effect on topology, 189
  problems, 227-228, 241-242
Ruff, 41 


Sagittarius, 140, 225, 235-240 
Sandgrouse, see Pterocles 
Sandpipers, 41, 268 
Sapsuckers, 91—92 
Saturation, 11—12, 78, 100 
Screamers, 235 
Scytalopus, 17, 21—22, 333—334 
Sea lamprey, 6—7 
Seal, gray, 42 
Secondary structure
  control region, 8
  12S rRNA, 125-126, 144-152, 221-222
Secretary bird, see Sagittarius 
Selection, 36, 217, 271
  balancing, 265, 271, 272
  for compact mt genomes, 6
  kin, 75
  overcoming gene flow, 73
  positions constrained by, 106, 217
  role in gene rearrangement, 13
  sexual, 43, 313
  stabilizing, 291
Sensory bias hypothesis, 295 
Seriema, see Cariama 
Sex determination, 34—35 
Sheep, 42 
Sittidae, 290 
Snake, 14, 16 
Snow goose, see Anser caerulescens 
Song sparrow, see Melospiza melodia 
Speciation, 70, 322
  allopatric, 334
  due to marine transgression, 364
  parapatric, 334
  vicariant, 77, 334
Spectral analyses, 353, 355 
Spheniscus magellanicus, 164 
Sphenodon punctatus, 10, 14—15 
Sphyrapicus varius, 91 
Spinetails, 339 
Spizella passerina, 315, 317, 318 
Spotted tinamou, 351 
Squamata, 14 
Star phylogeny, 150, 190, 267 
Steganopody, 160, 162 
Stiff-tailed ducks, 235 
Stone curlew, 140 
Stop codon, incomplete in chicken, 10 
Strigidae, 241 
Strigiformes, 235-237 
Struthio camelus, 176, 177 
Struthionoidea, 183 
Subfossils, 365 
Suboscine, 90 
Subspecies, 69, 72-74, 309 
Substitutions
  expected probability, 149
  correcting for unseen, 255
Sun bittern, see Eurypygia 
Sun grebe, see Heliornis 
Swallows, 293-294 
Synapomorphies, distribution of, 149—152 
Syrigma sibilatrix, 285 


Tandem duplication, 7, 20, 53
Tapaculo, see Scytalopus
Taxon sampling, 218-219, 253-258, 346
Tectonic changes, 328—331 
ΔT50H, 100-102, 108, see also DNA hybridization
  analysis problems, 214-215
  and molecular clock, 107
  reference scale, 100
Thalassornis, 235
Three-toed woodpecker, 309
Tinamus, 176, 177
Total data approach, 218
Total evidence approach, 218
Toucans, 91
Toxostoma, 308
Transition bias, 64, 258
Transition-transversion ratios, 96-97, 130-131
  correcting for bias in, 99
  estimation for avian species, 100, 260
  in chaffinches, 71
  inference by pairwise comparison, 95, 98-99
  in Paleognaths, 184
  instantaneous rates, 99, 130
  phylogenetic information, 98-99, 101
  statistical problems, 97
  weighting, 130, 223, 258
tRNA, 176
tRNA, rearrangement, 6, 8, 16
Trumpeters, see Psophia
Tuatara, see Sphenodon punctatus
Turdus migratorius, 301
Turnix, 122, 127, 219, 231-232
Turnstone, 54


Unrooted networks, 215, 226-227 
Uria, 54, 139 


Variable number tandem repeats, 31, 52
Veniliornis, 92
Verdin, 308
Vicariance, 314, 332, 336, 338, 339
Vicariant barrier, 269
Vidua chalybeata, 219, 226
VNTR, see Variable number tandem repeats


Weighting, 33, 148, 222, see also Phylogenetic reconstruction
  polymorphic sites, 258
  positions, 131-132
  transformation, 130-131
  transition-transversion, 130, 223-224, 258
Whistling duck, 235
Whistling heron, 285
White-backed duck, 235
Wing elements in ratites, 186-187
Woodpeckers, 92, 100, 107
WTSUBS, 133 


Xenopus, 8, 53, 58 


Zosterops, 266 


